OCRing in Style
Recently, during our weekly Technology eXchangE (TeX) meeting, Erik organized a short Code Kata event. The problem we had to solve was “OCR”. We had to parse input similar to this:
          _  _  _  _  _  _ 
  |  |  ||_ |_ |_| _| _|  |
  |  |  ||_||_| _||_ |_   |
                           
 _     _  _  _  _  _  _  _ 
|_||_||_   ||_  _|  || |  |
|_|  ||_|  | _| _|  ||_|  |
                           
_ _ _ _ _ _
|_ ||_| | _| _||_||_ |
|_| ||_| | _||_ | _| |
Of course we did it in our regular pair-programming TDD fashion.
I chose Ruby as the language, both to have more fun and to teach it to my partner in crime! Here is the result we produced during that hour (with tests, of course!):
# ocr_spec.rb
require File.expand_path("ocr", File.dirname( __FILE__ ))

describe Ocr do
  context ".parse" do
    it "parses 1" do
      Ocr.parse(
"   
  |
  |
   ").should == 1
    end

    it "parses 2" do
      Ocr.parse(
" _ 
 _|
|_ 
   ").should == 2
    end

    it "parses 3" do
      Ocr.parse(
" _ 
 _|
 _|
   ").should == 3
    end

    it "parses 4" do
      Ocr.parse(
"   
|_|
  |
   ").should == 4
    end

    it "parses 5" do
      Ocr.parse(
" _ 
|_ 
 _|
   ").should == 5
    end

    it "parses 6" do
      Ocr.parse(
" _ 
|_ 
|_|
   ").should == 6
    end

    it "parses 7" do
      Ocr.parse(
" _ 
  |
  |
   ").should == 7
    end

    it "parses 8" do
      Ocr.parse(
" _ 
|_|
|_|
   ").should == 8
    end

    it "parses 9" do
      Ocr.parse(
" _ 
|_|
 _|
   ").should == 9
    end

    it "parses 0" do
      Ocr.parse(
" _ 
| |
|_|
   ").should == 0
    end
  end

  context ".parse_file" do
    it "single line" do
      Ocr.parse_file("test_file_single_line.txt").should == [111669227]
    end

    it "multiple line file" do
      Ocr.parse_file("test_file_multi_line.txt").should == [111669227, 846753707]
    end
  end
end
# test_file_single_line.txt
          _  _  _  _  _  _ 
  |  |  ||_ |_ |_| _| _|  |
  |  |  ||_||_| _||_ |_   |
                           
# test_file_multi_line.txt
          _  _  _  _  _  _ 
  |  |  ||_ |_ |_| _| _|  |
  |  |  ||_||_| _||_ |_   |
                           
 _     _  _  _  _  _  _  _ 
|_||_||_   ||_  _|  || |  |
|_|  ||_|  | _| _|  ||_|  |
                           
# ocr.rb
class Ocr
  MAPPINGS = [
    " _ | ||_|   ",
    "     |  |   ",
    " _  _||_    ",
    " _  _| _|   ",
    "   |_|  |   ",
    " _ |_  _|   ",
    " _ |_ |_|   ",
    " _   |  |   ",
    " _ |_||_|   ",
    " _ |_| _|   ",
  ]

  def self.parse(input)
    MAPPINGS.index input.split($/).join
  end

  def self.parse_file(file_name)
    File.readlines(file_name)
      .each_slice(4)
      .map {|four_lines| four_lines
        .map {|line| line
          .chomp
          .chars
          .each_slice(3)
          .to_a
        }
        .transpose
        .map {|number_string| parse number_string.map(&:join)
          .join($/) }.join.to_i }
  end
end
The code above is definitely not production-ready! As you can see, it is mostly a one-liner. Good luck deciphering that! Here’s some reference material to help you out: Array#index, String#split, Array#join, IO.readlines, Enumerable#each_slice, Enumerable#map, String#chomp, String#chars, Enumerable#to_a, Array#transpose and String#to_i.
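If you would rather not decipher the chain by hand, here is a rough, unrolled sketch of what parse_file seems to do for a single entry. It is not the code we wrote during the kata: GLYPHS is a copy of MAPPINGS, while read_entry, ENTRY and WIDTH are made-up names for illustration. It also assumes that every entry is three rows of glyphs followed by a fourth row of spaces, and it pads each row with ljust so the example survives editors that strip trailing whitespace.

```ruby
# Each digit is a 3x4 cell, flattened row by row into a 12-character key.
# The position of a key in this array is the digit it stands for.
GLYPHS = [
  " _ | ||_|   ", "     |  |   ", " _  _||_    ", " _  _| _|   ",
  "   |_|  |   ", " _ |_  _|   ", " _ |_ |_|   ", " _   |  |   ",
  " _ |_||_|   ", " _ |_| _|   "
].freeze

WIDTH = 27 # 9 digits, 3 characters each (assumed entry width)

# One entry: three glyph rows plus an (assumed) fourth row of spaces.
ENTRY = [
  "          _  _  _  _  _  _",
  "  |  |  ||_ |_ |_| _| _|  |",
  "  |  |  ||_||_| _||_ |_   |",
  ""
].freeze

# Hypothetical helper: turns the four rows of one entry into an integer.
def read_entry(rows)
  # 1. Pad every row to full width and cut it into 3-character cells.
  cells_per_row = rows.map do |row|
    row.chomp.ljust(WIDTH).chars.each_slice(3).map(&:join)
  end

  # 2. Transpose 4 rows x 9 cells into 9 digits x 4 cells.
  cells_per_digit = cells_per_row.transpose

  # 3. Join the 4 cells of a digit into its key and look up the index.
  digits = cells_per_digit.map { |cells| GLYPHS.index(cells.join) }

  # 4. Glue the digits together and convert them to one integer.
  digits.join.to_i
end

puts read_entry(ENTRY) # prints 111669227
```

One thing the unrolled version makes easier to spot: an unrecognized glyph yields nil from the lookup, which join turns into an empty string, so a bad digit silently disappears from the number instead of raising an error.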