Code is everywhere you look these days - it’s being written, copied, pasted and shared on websites, in videos and articles, in messengers like Slack or Discord, and, of course, in your tools. In the most frustrating cases, the code that you might want to rip and re-use in your project is buried inside an image or a YouTube video, requiring you to type each character out manually.
We’re happy to eliminate that annoyance forever with the launch of CodeFromScreenshot.com!
We first released code extraction inside Pieces.app, our flagship AI-powered code snippet tool (which is incredibly awesome and gives you superhuman speed when saving, re-using and sharing snippets). Now we’ve decided to let that feature shine on its own with its very own web-based utility site.
CodeFromScreenshot.com provides a quick way to extract code from screenshots. It can even identify the language of the extracted code! We designed this site to be fast, easy to use and ad-free. It is the culmination of 6 months of development work on our very own Runtime.dev web APIs and intense work by our machine learning team to build OCR models that are fine-tuned to interpret technical language (that is, code) rather than natural language.
You may be wondering:
- Why are people saving or sending screenshots of code in the first place???
- Can’t I just embed code?
- Why not use other OCR tools?
Why are people saving or sending screenshots of code in the first place???
Inside the Pieces dev team, we’re constantly sharing screenshots of code and error traces in Slack. Screenshots are preferred in many cases because they preserve text formatting and hence communicate more than just the text. Look at these two error messages submitted in Slack, and tell me which one communicates why my C++ code won’t compile :)
Can’t I just embed code?
Developer-focused content creators for publications like Medium may prefer to spend more time developing code and less time formatting it. Medium’s code embedding features are lacking: to get full syntax highlighting, you need to use an external code embedding service. Some content creators opt for screenshots of their work instead, but readers want to use the code without having to reproduce it character by character.
Check out the hoops some people jump through to embed and format code on Medium in this article.
Why did we build our own OCR models rather than use existing OCR tools?
Most available OCR models were developed to extract plain text from images. We found that they underserve developers, whose screenshots are full of code-specific punctuation that models designed for plain text tend to miss. So we painstakingly designed our own OCR models for code extraction.
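To make the punctuation point concrete, here is a minimal Python sketch that compares how much of a line of prose versus a line of code is made up of punctuation. The character set and the function name `punctuation_density` are our own illustrative assumptions; this is not how the Pieces OCR models work, just a quick way to see why dropping punctuation hurts code far more than prose.

```python
# Characters that usually carry meaning in source code.
# Illustrative assumption only, not a list used by any real OCR model.
CODE_PUNCTUATION = set("{}[]()<>;:=+-*/%&|^~!#@$\\_`\"'.,?")


def punctuation_density(text: str) -> float:
    """Return the fraction of non-whitespace characters that are punctuation."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return 0.0
    return sum(ch in CODE_PUNCTUATION for ch in visible) / len(visible)


# Two sample lines: one of natural language, one of C++-style code.
prose = "Screenshots are preferred because they preserve formatting"
code = "for (int i = 0; i < n; ++i) { total += values[i]; }"

print(f"prose: {punctuation_density(prose):.2f}")
print(f"code:  {punctuation_density(code):.2f}")
```

The code line should score much higher than the prose line, so an OCR model that misreads or drops even a few symbols is far more likely to produce code that no longer compiles.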
P.S. We want to shout out some of the other web utilities we took inspiration from, including:
- https://tinypng.com/ - minifies PNGs to enable your web content to load faster
- https://www.minifier.org/ - minifies JS and CSS
- https://www.diffchecker.com/ - checks for differences between two samples of text
- https://regex-generator.olafneumann.org/ - generates a regular expression from sample text
Lead photo from FreeCodeCamp's Dart Programming Tutorial on YouTube