OCRing in Style
Recently, during our weekly Technology eXchangE (TeX) meeting, Erik organized a short Code Kata event. The problem we had to solve was ”OCR”. We had to parse input similar to this:
_ _ _ _ _ _
| | ||_ |_ |_| _| _| |
| | ||_||_| _||_ |_ |
_ _ _ _ _ _ _ _
|_||_||_ ||_ _| || | |
|_| ||_| | _| _| ||_| |
_ _ _ _ _ _
|_ ||_| | _| _||_||_ |
|_| ||_| | _||_ | _| |
Of course we did it in our regular pair-programming TDD fashion.
I decided to use Ruby as a language of choice to have more fun and also teach it to my partner in crime! Here is the result we produced during that hour (with tests of course!):
# ocr_spec.rb
require File.expand_path("ocr", File.dirname( __FILE__ ))
describe Ocr do
context ".parse" do
it "parses 1" do
Ocr.parse(
"
|
|
").should == 1
end
it "parses 2" do
Ocr.parse(
" _
_|
|_
").should == 2
end
it "parses 3" do
Ocr.parse(
" _
_|
_|
").should == 3
end
it "parses 4" do
Ocr.parse(
"
|_|
|
").should == 4
end
it "parses 5" do
Ocr.parse(
" _
|_
_|
").should == 5
end
it "parses 6" do
Ocr.parse(
" _
|_
|_|
").should == 6
end
it "parses 7" do
Ocr.parse(
" _
|
|
").should == 7
end
it "parses 8" do
Ocr.parse(
" _
|_|
|_|
").should == 8
end
it "parses 9" do
Ocr.parse(
" _
|_|
_|
").should == 9
end
it "parses 0" do
Ocr.parse(
" _
| |
|_|
").should == 0
end
end
context ".parse_file" do
it "single line" do
Ocr.parse_file("test_file_single_line.txt").should == [111669227]
end
it "multiple line file" do
Ocr.parse_file("test_file_multi_line.txt").should == [111669227, 846753707]
end
end
end
# test_file_single_line.txt
_ _ _ _ _ _
| | ||_ |_ |_| _| _| |
| | ||_||_| _||_ |_ |
# test_file_multi_line.txt
_ _ _ _ _ _
| | ||_ |_ |_| _| _| |
| | ||_||_| _||_ |_ |
_ _ _ _ _ _ _ _
|_||_||_ ||_ _| || | |
|_| ||_| | _| _| ||_| |
# ocr.rb
class Ocr
MAPPINGS = [
" _ | ||_| ",
" | | ",
" _ _||_ ",
" _ _| _| ",
" |_| | ",
" _ |_ _| ",
" _ |_ |_| ",
" _ | | ",
" _ |_||_| ",
" _ |_| _| ",
]
def self.parse(input)
MAPPINGS.index input.split($/).join
end
def self.parse_file(file_name)
File.readlines(file_name)
.each_slice(4)
.map {|four_lines| four_lines
.map {|line| line
.chomp
.chars
.each_slice(3)
.to_a
}
.transpose
.map {|number_string| parse number_string.map(&:join)
.join($/) }.join.to_i }
end
end
The code above is definitely not production-ready! As you can see it is mostly one-liner. Good luck deciphering that! Here’s some reference material to help you out: Array#index, String#split, Array#join, IO.readlines, Enumerable#each_slice, Enumerable#map, String#chomp, String#chars, Enumerable#to_a, Array#transpose and String#to_i
Our recent stories
Partnering with Nicigas in their Energy Development Business
In Codeborne we have built energy information systems in Estonia, Sweden, Norway, Denmark, Luxembourg, Austria, and Japan. With Nicigas we combined our expertise and skillset to create something new on the Japanese market.
From wine tasting to digital innovation- the birth of the Wine Experience Club
We recently sat down with our client, Rait Maasikas, to go deeper into the story behind one of our more unusual recent projects- the Wine Experience Club.
Innovating the Austrian energy market with Spotty Smart Energy Partner
Spotty Smart Energy Partner GmbH, an Austrian energy provider, partnered with Codeborne to enhance their services and bring innovative energy solutions to the market