![Blog post cover illustration OCRing in Style](/media/2014/WP_000262.jpg)
OCRing in Style
Recently, during our weekly Technology eXchangE (TeX) meeting, Erik organized a short Code Kata event. The problem to solve was "OCR": turning digits drawn with pipes and underscores back into numbers. We had to parse input similar to this:
          _  _  _  _  _  _ 
  |  |  ||_ |_ |_| _| _|  |
  |  |  ||_||_| _||_ |_   |
 _     _  _  _  _  _  _  _ 
|_||_||_   ||_  _|  || |  |
|_|  ||_|  | _| _|  ||_|  |
_ _ _ _ _ _
|_ ||_| | _| _||_||_ |
|_| ||_| | _||_ | _| |
Of course we did it in our regular pair-programming TDD fashion.
I chose Ruby, both to have more fun and to teach it to my partner in crime! Here is the result we produced during that hour (with tests, of course!):
# ocr_spec.rb
require File.expand_path("ocr", File.dirname(__FILE__))

describe Ocr do
  context ".parse" do
    # Every glyph is 3 characters wide: three drawn rows plus a blank fourth row.
    it "parses 1" do
      Ocr.parse([
        "   ",
        "  |",
        "  |",
        "   ",
      ].join($/)).should == 1
    end

    it "parses 2" do
      Ocr.parse([
        " _ ",
        " _|",
        "|_ ",
        "   ",
      ].join($/)).should == 2
    end

    it "parses 3" do
      Ocr.parse([
        " _ ",
        " _|",
        " _|",
        "   ",
      ].join($/)).should == 3
    end

    it "parses 4" do
      Ocr.parse([
        "   ",
        "|_|",
        "  |",
        "   ",
      ].join($/)).should == 4
    end

    it "parses 5" do
      Ocr.parse([
        " _ ",
        "|_ ",
        " _|",
        "   ",
      ].join($/)).should == 5
    end

    it "parses 6" do
      Ocr.parse([
        " _ ",
        "|_ ",
        "|_|",
        "   ",
      ].join($/)).should == 6
    end

    it "parses 7" do
      Ocr.parse([
        " _ ",
        "  |",
        "  |",
        "   ",
      ].join($/)).should == 7
    end

    it "parses 8" do
      Ocr.parse([
        " _ ",
        "|_|",
        "|_|",
        "   ",
      ].join($/)).should == 8
    end

    it "parses 9" do
      Ocr.parse([
        " _ ",
        "|_|",
        " _|",
        "   ",
      ].join($/)).should == 9
    end

    it "parses 0" do
      Ocr.parse([
        " _ ",
        "| |",
        "|_|",
        "   ",
      ].join($/)).should == 0
    end
  end

  context ".parse_file" do
    it "single line" do
      Ocr.parse_file("test_file_single_line.txt").should == [111669227]
    end

    it "multiple line file" do
      Ocr.parse_file("test_file_multi_line.txt").should == [111669227, 846753707]
    end
  end
end
# test_file_single_line.txt (each entry is four rows of 27 characters; the fourth row is all spaces)
          _  _  _  _  _  _ 
  |  |  ||_ |_ |_| _| _|  |
  |  |  ||_||_| _||_ |_   |
# test_file_multi_line.txt (same layout, two entries)
          _  _  _  _  _  _ 
  |  |  ||_ |_ |_| _| _|  |
  |  |  ||_||_| _||_ |_   |
 _     _  _  _  _  _  _  _ 
|_||_||_   ||_  _|  || |  |
|_|  ||_|  | _| _|  ||_|  |
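# ocr.rb
class Ocr
  # Index == digit. Each entry is one glyph (three drawn rows plus a blank
  # fourth row, 3 characters each) flattened row by row into 12 characters.
  MAPPINGS = [
    " _ | ||_|   ",
    "     |  |   ",
    " _  _||_    ",
    " _  _| _|   ",
    "   |_|  |   ",
    " _ |_  _|   ",
    " _ |_ |_|   ",
    " _   |  |   ",
    " _ |_||_|   ",
    " _ |_| _|   ",
  ]

  # Looks up a single glyph whose rows are separated by newlines.
  def self.parse(input)
    MAPPINGS.index input.split($/).join
  end

  # Reads a file of 4-line entries and returns one integer per entry.
  def self.parse_file(file_name)
    File.readlines(file_name)
      .each_slice(4)
      .map {|four_lines| four_lines
        .map {|line| line
          .chomp
          .chars
          .each_slice(3)
          .to_a
        }
        .transpose
        .map {|number_string| parse number_string.map(&:join)
          .join($/) }.join.to_i }
  end
end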
The code above is definitely not production-ready! As you can see, it is mostly a one-liner. Good luck deciphering that! Here’s some reference material to help you out: Array#index, String#split, Array#join, IO.readlines, Enumerable#each_slice, Enumerable#map, String#chomp, String#chars, Enumerable#to_a, Array#transpose and String#to_i.
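For anyone who would rather not decipher the chain unaided, here is a minimal sketch that names each intermediate step of `Ocr.parse_file`. It works on an in-memory entry instead of a file. The names `GLYPHS`, `SAMPLE` and `decode_entry` are hypothetical and not part of the kata code, and the sketch assumes that every glyph is 3 characters wide and 4 rows tall with a blank fourth row, as the `each_slice(3)` and `each_slice(4)` calls suggest. It also skips the round trip through `$/` that `Ocr.parse` performs and looks the flattened glyph up directly.

```ruby
# Hypothetical lookup table: index == digit, each glyph flattened row by row
# (three drawn rows plus an assumed blank fourth row, 12 characters in total).
GLYPHS = [
  " _ | ||_|   ", # 0
  "     |  |   ", # 1
  " _  _||_    ", # 2
  " _  _| _|   ", # 3
  "   |_|  |   ", # 4
  " _ |_  _|   ", # 5
  " _ |_ |_|   ", # 6
  " _   |  |   ", # 7
  " _ |_||_|   ", # 8
  " _ |_| _|   ", # 9
].freeze

# One 4-row entry encoding 111669227; every row is 27 characters wide.
SAMPLE = [
  " " * 9 + " _ " * 6,
  "  |  |  ||_ |_ |_| _| _|  |",
  "  |  |  ||_||_| _||_ |_   |",
  " " * 27,
].freeze

def decode_entry(four_lines)
  # Step 1: cut every row into 3-character cells (4 rows x 9 cells).
  rows_of_cells = four_lines.map { |line| line.chomp.chars.each_slice(3).to_a }

  # Step 2: transpose so that each element holds the 4 cells of one glyph.
  glyphs = rows_of_cells.transpose

  # Step 3: flatten each glyph into a 12-character key and look up its digit.
  digits = glyphs.map { |glyph| GLYPHS.index(glyph.map(&:join).join) }

  # Step 4: glue the digits together and convert the result to an integer.
  digits.join.to_i
end

puts decode_entry(SAMPLE) # prints 111669227
```

The transpose is the step that does the real work: before it the data is grouped by row, after it the data is grouped by digit, and that grouping is what the lookup table expects.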