OCRing in Style
Recently, during our weekly Technology eXchangE (TeX) meeting, Erik organized a short Code Kata event. The problem we had to solve was "OCR": parsing numbers drawn with pipes and underscores, in input similar to this:
          _  _  _  _  _  _ 
  |  |  ||_ |_ |_| _| _|  |
  |  |  ||_||_| _||_ |_   |

 _     _  _  _  _  _  _  _ 
|_||_||_   ||_  _|  || |  |
|_|  ||_|  | _| _|  ||_|  |

_ _ _ _ _ _
|_ ||_| | _| _||_||_ |
|_| ||_| | _||_ | _| |
Of course, we did it in our regular pair-programming TDD fashion.
I chose Ruby as the language, both to have more fun and to teach it to my partner in crime! Here is the result we produced during that hour (with tests, of course!):
# ocr_spec.rb
require File.expand_path("ocr", File.dirname(__FILE__))

describe Ocr do
  context ".parse" do
    it "parses 1" do
      Ocr.parse(
"   
  |
  |
   ").should == 1
    end

    it "parses 2" do
      Ocr.parse(
" _ 
 _|
|_ 
   ").should == 2
    end

    it "parses 3" do
      Ocr.parse(
" _ 
 _|
 _|
   ").should == 3
    end

    it "parses 4" do
      Ocr.parse(
"   
|_|
  |
   ").should == 4
    end

    it "parses 5" do
      Ocr.parse(
" _ 
|_ 
 _|
   ").should == 5
    end

    it "parses 6" do
      Ocr.parse(
" _ 
|_ 
|_|
   ").should == 6
    end

    it "parses 7" do
      Ocr.parse(
" _ 
  |
  |
   ").should == 7
    end

    it "parses 8" do
      Ocr.parse(
" _ 
|_|
|_|
   ").should == 8
    end

    it "parses 9" do
      Ocr.parse(
" _ 
|_|
 _|
   ").should == 9
    end

    it "parses 0" do
      Ocr.parse(
" _ 
| |
|_|
   ").should == 0
    end
  end

  context ".parse_file" do
    it "single line" do
      Ocr.parse_file("test_file_single_line.txt").should == [111669227]
    end

    it "multiple line file" do
      Ocr.parse_file("test_file_multi_line.txt").should == [111669227, 846753707]
    end
  end
end
# test_file_single_line.txt
          _  _  _  _  _  _ 
  |  |  ||_ |_ |_| _| _|  |
  |  |  ||_||_| _||_ |_   |
                           
# test_file_multi_line.txt
          _  _  _  _  _  _ 
  |  |  ||_ |_ |_| _| _|  |
  |  |  ||_||_| _||_ |_   |
                           
 _     _  _  _  _  _  _  _ 
|_||_||_   ||_  _|  || |  |
|_|  ||_|  | _| _|  ||_|  |
                           
# ocr.rb
class Ocr
  # Each entry is one glyph: four rows of three characters (the fourth row
  # is blank), joined into a 12-character string. The array index is the digit.
  MAPPINGS = [
    " _ | ||_|   ",
    "     |  |   ",
    " _  _||_    ",
    " _  _| _|   ",
    "   |_|  |   ",
    " _ |_  _|   ",
    " _ |_ |_|   ",
    " _   |  |   ",
    " _ |_||_|   ",
    " _ |_| _|   ",
  ]

  def self.parse(input)
    MAPPINGS.index input.split($/).join
  end

  def self.parse_file(file_name)
    File.readlines(file_name)
      .each_slice(4)
      .map {|four_lines| four_lines
        .map {|line| line
          .chomp
          .chars
          .each_slice(3)
          .to_a
        }
        .transpose
        .map {|number_string| parse number_string.map(&:join)
          .join($/) }.join.to_i }
  end
end
The code above is definitely not production-ready! As you can see, it is mostly a one-liner. Good luck deciphering that! Here's some reference material to help you out: Array#index, String#split, Array#join, IO.readlines, Enumerable#each_slice, Enumerable#map, String#chomp, String#chars, Enumerable#to_a, Array#transpose and String#to_i.
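If you would rather not decipher it alone, here is a minimal sketch of what the chain in parse_file does to one group of four lines. It uses a hypothetical two-digit sample ("12") rather than one of the test files, and it assumes what the each_slice(4) and each_slice(3) calls imply: every glyph is three columns wide and four rows tall, with a fourth row made only of spaces. SAMPLE_MAPPINGS and the variable names are illustrative and not part of the original code; the sketch also skips the join($/) and split($/) round trip between parse_file and parse.

```ruby
# Hypothetical two-digit sample ("12"), not one of the post's test files.
# Each row is 6 characters wide (2 glyphs x 3 columns); the fourth row is blank.
rows = [
  "    _ ",
  "  | _|",
  "  ||_ ",
  "      "
]

# Step 1: cut every row into 3-character cells.
cells_per_row = rows.map { |row| row.chars.each_slice(3).map(&:join) }

# Step 2: transpose, so that each element holds the four rows of one glyph
# instead of one row of every glyph.
glyphs = cells_per_row.transpose

# Step 3: flatten each glyph into a single 12-character lookup key.
keys = glyphs.map(&:join)

# Illustrative lookup table with only the two glyphs this sample needs.
SAMPLE_MAPPINGS = {
  "     |  |   " => 1,
  " _  _||_    " => 2
}

# Step 4: look up every glyph, then glue the digits into one integer.
digits = keys.map { |key| SAMPLE_MAPPINGS.fetch(key) }
number = digits.join.to_i

puts number
```

The transpose call is the step that does the real work: the file stores the number row by row, while the lookup needs it glyph by glyph. It is also why every line in a group, including the blank fourth one, has to have the same width, because Array#transpose raises an IndexError when the inner arrays differ in length.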