
OCRing in Style
Recently, during our weekly Technology eXchangE (TeX) meeting, Erik organized a short Code Kata event. The problem we had to solve was ”OCR”. We had to parse input similar to this:
_ _ _ _ _ _
| | ||_ |_ |_| _| _| |
| | ||_||_| _||_ |_ |
_ _ _ _ _ _ _ _
|_||_||_ ||_ _| || | |
|_| ||_| | _| _| ||_| |
_ _ _ _ _ _
|_ ||_| | _| _||_||_ |
|_| ||_| | _||_ | _| |
Of course we did it in our regular pair-programming TDD fashion.
I decided to use Ruby as a language of choice to have more fun and also teach it to my partner in crime! Here is the result we produced during that hour (with tests of course!):
# ocr_spec.rb
require File.expand_path("ocr", File.dirname( __FILE__ ))
describe Ocr do
context ".parse" do
it "parses 1" do
Ocr.parse(
"
|
|
").should == 1
end
it "parses 2" do
Ocr.parse(
" _
_|
|_
").should == 2
end
it "parses 3" do
Ocr.parse(
" _
_|
_|
").should == 3
end
it "parses 4" do
Ocr.parse(
"
|_|
|
").should == 4
end
it "parses 5" do
Ocr.parse(
" _
|_
_|
").should == 5
end
it "parses 6" do
Ocr.parse(
" _
|_
|_|
").should == 6
end
it "parses 7" do
Ocr.parse(
" _
|
|
").should == 7
end
it "parses 8" do
Ocr.parse(
" _
|_|
|_|
").should == 8
end
it "parses 9" do
Ocr.parse(
" _
|_|
_|
").should == 9
end
it "parses 0" do
Ocr.parse(
" _
| |
|_|
").should == 0
end
end
context ".parse_file" do
it "single line" do
Ocr.parse_file("test_file_single_line.txt").should == [111669227]
end
it "multiple line file" do
Ocr.parse_file("test_file_multi_line.txt").should == [111669227, 846753707]
end
end
end
# test_file_single_line.txt
_ _ _ _ _ _
| | ||_ |_ |_| _| _| |
| | ||_||_| _||_ |_ |
# test_file_multi_line.txt
_ _ _ _ _ _
| | ||_ |_ |_| _| _| |
| | ||_||_| _||_ |_ |
_ _ _ _ _ _ _ _
|_||_||_ ||_ _| || | |
|_| ||_| | _| _| ||_| |
# ocr.rb
class Ocr
MAPPINGS = [
" _ | ||_| ",
" | | ",
" _ _||_ ",
" _ _| _| ",
" |_| | ",
" _ |_ _| ",
" _ |_ |_| ",
" _ | | ",
" _ |_||_| ",
" _ |_| _| ",
]
def self.parse(input)
MAPPINGS.index input.split($/).join
end
def self.parse_file(file_name)
File.readlines(file_name)
.each_slice(4)
.map {|four_lines| four_lines
.map {|line| line
.chomp
.chars
.each_slice(3)
.to_a
}
.transpose
.map {|number_string| parse number_string.map(&:join)
.join($/) }.join.to_i }
end
end
The code above is definitely not production-ready! As you can see it is mostly one-liner. Good luck deciphering that! Here’s some reference material to help you out: Array#index, String#split, Array#join, IO.readlines, Enumerable#each_slice, Enumerable#map, String#chomp, String#chars, Enumerable#to_a, Array#transpose and String#to_i
Our recent stories
Weekly TeX - how knowledge sharing shapes our culture
Weekly Technology Exchange (TeX) sessions are an integral part of Codeborne’s DNA. These sessions provide our team with the opportunity to share new knowledge, present interesting discoveries, and discuss the latest technological advancements.
The story of HIIT Pekk
At Codeborne, we believe that work should be balanced with fun and fitness. Our journey with high-intensity interval training (HIIT) began as a creative response to the challenges posed by the COVID-19 pandemic.
Transforming the workflow for financial reporting
From Word documents to a unified platform: Tackling workflow inefficiencies