6 Pages
1. Synthetic data is becoming pervasive in AI
  • So far this year, synthetic data startups have raised $170M+ in venture funding – already an annual record with more than half the year still left. (The prior 5 years saw a total of $210M in funding.) This uptick has come in parallel with the growth of AI.
  • Traditionally, training AI has required vast volumes of high-quality data. Synthetic data – algorithmically generated data that mirrors the statistical characteristics of real-world data – can reduce the data needed to train AI models by as much as 70-90%. According to Gartner, by 2024, 60% of data used to develop AI models will be synthetically generated. By 2027, the market for synthetic data is projected to reach $1.2B (up from $110M in 2021).
  • There are at least 76 startups working on synthetic data (as well as an array of open-source tools like the Synthetic Data Vault libraries). This year alone, we’ve seen funding rounds for Synthesis AI ($17M), Datagen ($50M), Synthetaic ($13M), Mindtech ($3.7M), MDClone ($63M), and MOSTLY AI ($25M). Many are focusing on use cases in computer vision, such as facial recognition and satellite imagery analysis.
  • Synthetic data can be structured (e.g. tables, spreadsheets) or unstructured (e.g. faces, 3D environments, audio, text). Some startups – particularly those in Europe – offer privacy-preserving datasets, while others are generating test data where privacy is less important.
  • Synthetic data is increasingly being generated via AI rather than SQL and common programming languages. Using AI techniques like generative adversarial networks (GANs), one startup can generate 10M+ labeled images. GANs pit two unsupervised neural networks against each other in a feedback loop. The generator creates synthetic data from real-world inputs, and the discriminator tries to figure out what is fake. The two learn over thousands of iterations until the generator’s output is convincingly real. Not all synthetic data is developed from real-world data – e.g. Rendered.ai generates data from physics equations (e.g. how light interacts with matter).
  • Similarly, with the rise of the metaverse, synthetic data can be used to train the algorithms that will power the environments and interactions in the future metaverse – the primary rationale for Meta’s AI.Reverie acquisition. Synthesis AI has generated 100K “synthetic people” that can be used to develop more emotive and realistic avatars. “Emotion AI” – using AI on facial micromovements and body language to detect people’s emotions – is a particularly active and controversial use case.
  • Despite the buzz, synthetic data is still not as effective as real-world data in training high-accuracy AI systems. Datasets that are highly representative and deeply anonymized are hard to generate. The quality of the data is highly dependent on the quality of the process/AI creating it. Poor synthetic data can result in algorithms that make low-quality and even unsafe recommendations.
  • One study found that 92% of AI models trained on synthetic data had lower accuracy than those trained on real data. Models trained on synthetic data had a 6-19% deviation in accuracy compared to models trained on real data. There’s also the risk that the synthetic data misses key patterns, insights and opportunities – or turns up patterns not in the original data. For some companies, these downsides may be “manageable,” while for others they may be unacceptable.
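To make "data that mirrors the statistical characteristics of real-world data" concrete, here is a minimal sketch for a two-column structured table. It is deliberately not a GAN: it swaps in a much simpler technique, fitting a two-dimensional Gaussian (means and covariance) to the real rows and sampling new rows from it. The "real" table, its column meanings, and all function names are hypothetical and invented for illustration.

```python
import math
import random
import statistics


def fit_gaussian_2d(rows):
    """Estimate column means and the 2x2 sample covariance of a two-column table."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    var_x = statistics.variance(xs)
    var_y = statistics.variance(ys)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (len(rows) - 1)
    return (mean_x, mean_y), (var_x, cov_xy, var_y)


def sample_synthetic(params, n, rng):
    """Draw n synthetic rows using a hand-written 2x2 Cholesky factor."""
    (mean_x, mean_y), (var_x, cov_xy, var_y) = params
    l11 = math.sqrt(var_x)
    l21 = cov_xy / l11
    l22 = math.sqrt(max(var_y - l21 * l21, 0.0))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append((mean_x + l11 * z1, mean_y + l21 * z1 + l22 * z2))
    return rows


rng = random.Random(0)

# Hypothetical "real" table: (age, annual spend), with spend loosely tied to age.
real = []
for _ in range(2000):
    age = rng.gauss(40, 10)
    spend = 50 * age + rng.gauss(0, 300)
    real.append((age, spend))

params = fit_gaussian_2d(real)
synthetic = sample_synthetic(params, 2000, rng)  # no real row is copied
```

The synthetic rows reproduce the means and the age-spend correlation of the original table without containing any original record. They also show the limitation noted above: anything the fitted model does not capture (outliers, non-linear patterns, rare subgroups) is missing from the output.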
Related Content:
  • Apr 15 2022 (3 Shifts): Google’s Transformer-based Pathways Language Model is breaking new ground
  • Apr 8 2022 (3 Shifts): Regulators want to destroy ill-gotten AI models and data when businesses break rules
2. Shein is the world's 3rd-most valuable private startup – driven by ultra-fast fashion and social commerce
  • The $100B valuation – more than the combined worth of established fast-fashion players H&M and Zara – is reflective of Shein’s growth to become a dominant apparel player. Founded in 2008 and re-branded in 2015, Shein has only risen to global prominence over the past few years. In 2016, its revenue was just $600M+. By 2019, Shein had reached $3.2B in revenue and $5B in valuation. Just two years later, its revenue had quintupled to $16B in 2021. This year, despite recent slowing growth, it’s projected to reach an eye-watering $20B in revenue.
  • Shein’s success comes from piggybacking on – and in some cases turbocharging – popular retail trends in near real-time. Its apparel selection is powered by AI, which identifies trends and analyzes demand patterns to inform new styles. Shein takes fast fashion to the next level, updating its website with an average of 6K new items every day – an endless torrent of current designs. For context, over a recent 12-month period, Gap listed 12K different items on its site, H&M had 25K, and Zara had 35K. Shein, in contrast, had 1.3M items. Its items are also fresher – 40-70% of its assortment is under 3 months old.
  • Shein is able to bring new styles to market extremely quickly. Located near one of China’s apparel manufacturing hubs in Guangzhou, Shein can closely manage its production chain from design to manufacturing, with high digitization and integration between each step. This means it can design a collection in as little as 3 days, and go from concept to production in under 2 weeks, sometimes in as little as 5-7 days.
  • As a mostly-online, direct-to-consumer brand, Shein runs lean with limited overhead, initially placing small orders of just 30-200 pieces for new items. Its custom software automatically orders more if an item sells well or terminates production if not. Shein sends items directly from its Chinese factories to customers globally. In the US – Shein’s largest market – it takes advantage of a regulatory loophole that lets it avoid import tariffs on shipments worth less than $800. (The exemption was raised from $200 to $800 in 2016.)
  • Because of the way it operates, Shein’s assortment is substantially cheaper than its rivals’. The median price for a dress on Shein is $13, compared with $50 for Zara and $30 for H&M. Some categories routinely sell for $5 or less. While its revenue has been growing, its profit margins are relatively thin – reportedly just 5% as of last year.
  • Wildly popular among the Gen Z Instagram/TikTok crowd, Shein deploys a marketing strategy centered on social commerce. It enrolls a large number of influencers (including lesser-known ones with medium-sized followings) who churn out content to help sell styles, providing them with 10-20% commission, free clothes, and follower discount codes. It also uses search engine optimization, livestream events, gamified interactions and a point-based incentive system. Shein is the most talked-about brand on Instagram, YouTube, and TikTok – on TikTok, #sheinhaul videos have amassed 5.2B+ views to date.
  • Shein is taking steps to mitigate the risks to its business model. It has built a team of 100+ staff to review designs as well as a large in-house design team, since many of its copycat and IP issues stem from its independent-supplier ecosystem. It also recently opened a new distribution center in Indiana, which could offer some operational flexibility if the de minimis exemption is eliminated.
  • For now, Shein still has room to grow – though perhaps more slowly than in prior years. It has only 7M+ monthly active users in the US, out of the 230M+ online-shopping consumers. Operationally, relatively few rivals can do what Shein can do – at least right now, though that will change. It has some time, especially given its most recent round of funding, to respond to its critics and adapt to the changing environment. In the meantime, its success with the “TikTokification” of retail is likely to influence rival retailers’ strategies, while its “fast fashion 2.0” is likely to inspire analogous “fast manufacturing” in other domains.
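The small-batch approach described above (test orders of 30-200 pieces, then automatic reorder or termination) can be sketched as a simple sell-through rule. This is not Shein's actual software, whose logic is not public. The thresholds, the scale factor, and every name below are hypothetical assumptions chosen only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class TrialBatch:
    sku: str
    units_ordered: int  # initial small batch, e.g. 30-200 pieces
    units_sold: int     # units sold during the trial window


def next_action(batch, reorder_at=0.7, drop_below=0.2, scale=3):
    """Decide what to do with an item from its trial sell-through rate.

    All three parameters are illustrative assumptions, not known values.
    Returns a tuple of (action, quantity to order).
    """
    if batch.units_ordered <= 0:
        raise ValueError("units_ordered must be positive")
    sell_through = batch.units_sold / batch.units_ordered
    if sell_through >= reorder_at:
        # Strong demand: place a larger follow-up order.
        return ("reorder", batch.units_ordered * scale)
    if sell_through < drop_below:
        # Weak demand: stop producing the item.
        return ("terminate", 0)
    # In between: keep selling existing stock and wait for more data.
    return ("hold", 0)


print(next_action(TrialBatch("dress-001", 100, 85)))  # ('reorder', 300)
print(next_action(TrialBatch("top-002", 50, 5)))      # ('terminate', 0)
print(next_action(TrialBatch("skirt-003", 200, 80)))  # ('hold', 0)
```

Because the first order is small, a wrong bet costs little, while a hit can be scaled quickly. That asymmetry is what makes listing thousands of new items a day economically plausible.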
Related Content:
  • Sep 24 2021 (3 Shifts): Livestream shopping – worth hundreds of billions globally – is coming to America
  • Oct 1 2021 (3 Shifts): Warby Parker & other online DTC brands are seeing the value of physical retail
3. Shopify’s push into fulfillment and digital advertising – head-to-head against Amazon
  • Last week, Shopify’s Q1 2022 earnings report fell short of expectations. Although quarterly revenue grew to $1.2B (up 22% year-over-year), both revenue and earnings came in below analyst estimates. Gross merchandise volume grew 16% year-over-year to $43B, short of the expected $46.5B. Perhaps in an effort to counter the disappointing report, Shopify coupled its earnings with major announcements in fulfillment and digital advertising.
  • The two announcements put Shopify head-to-head against Amazon, which has substantial businesses in both fulfillment services and advertising. In Apr 2022, Amazon unveiled its Buy with Prime program – called “Amazon as a service” by some. The program extends Prime’s “fast, free delivery” and the convenience of Amazon-enabled checkout to Prime members shopping on merchants’ own websites. The program will eventually be rolled out even to merchants not selling on Amazon.com or using its fulfillment services. Similar to Amazon’s AWS, sellers’ data is protected and they pay only for what they use (e.g. fulfillment, storage, payment processing).
  • Viewed as a direct shot against Shopify in going after direct-to-consumer brands, Buy with Prime ignited speculation that Shopify might opt to coexist with the program in its merchants’ best interests. Even if that turns out to be the case, the Deliverr announcement suggests that Shopify doesn’t plan on standing down its fulfillment ambitions – despite being up against the Amazon logistics juggernaut.
  • But advertising tools from Shopify could be good for its merchants, depending on how they are implemented. Shopify merchants already spend 10-30% of their revenue on Facebook and Instagram ads – collectively representing tens of billions of dollars (and the largest segment of ecommerce ad dollars going to Meta). Many merchants are looking for alternate venues for their ad spend, especially given the negative impact of Apple’s App Tracking Transparency on Facebook ad performance. While right now Shopify is just offering a value-added data exchange, there’s a pathway for it to build upon that foothold with an expanded suite of advertising tools and analytics.
Related Content:
  • Jan 21 2022 (3 Shifts): Shopify partners with JD.com to help small businesses reach Chinese consumers
  • Feb 26 2021 (3 Shifts): Walmart & Amazon team up with SaaS ecommerce platforms
Disclosure: Contributors have investment interests in Meta, Microsoft, Apple, and Alphabet. Amazon and Google are vendors of 6Pages.
Have a comment about this brief or a topic you'd like to see us cover? Send us a note at tips@6pages.com.