THE FIRST APPLICATION IN RUBY
I'm starting to develop in Ruby, and the best way to understand something is to try to explain it.
The Application: A Text Analyzer
Requirements
Basic features:
Character count
Character count (excluding spaces)
Line count
Word count
Sentence count
Paragraph count
Average number of words per sentence
Average number of sentences per paragraph
Building the basic application:
When starting to develop a new program it's useful to think of the key steps involved. In the past it was common to draw flow charts to show how the operation of a computer program would flow, but with modern tools such as Ruby it's easy to experiment, change things about, and remain agile. Let's outline the basic steps as follows:
1.- Load in a file containing the text or document you want to analyze
2.- As you load the file line by line, keep a count of how many lines there were (one of your statistics taken care of)
3.- Put the text into a string and measure its length to get your character count
4.- Temporarily remove all whitespace and measure the length of the resulting string to get the character count excluding whitespace
5.- Split on whitespace to find out how many words there are
6.- Split on full stops to find out how many sentences there are
7.- Split on double newlines to find out how many paragraphs there are
8.- Perform calculations to work out the averages
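Before touching a real file, the steps above can be tried on a small in-memory string. This is only a rough sketch under assumptions: the method name rough_stats and the sample text are made up for illustration, and step 1 (loading the file) is replaced by a string literal.

```ruby
# Hypothetical helper: walks through steps 2 to 8 on a string instead of a file.
def rough_stats(text)
  lines      = text.lines.length            # step 2: count the lines
  characters = text.length                  # step 3: total characters
  no_spaces  = text.gsub(/\s+/, '').length  # step 4: characters without whitespace
  words      = text.split.length            # step 5: split on whitespace
  sentences  = text.split(/\./).length      # step 6: split on full stops
  paragraphs = text.split(/\n\n/).length    # step 7: split on double newlines

  {
    lines: lines,
    characters: characters,
    characters_no_spaces: no_spaces,
    words: words,
    sentences: sentences,
    paragraphs: paragraphs,
    # step 8: averages (integer division, so the fractional part is dropped)
    words_per_sentence: words / sentences,
    sentences_per_paragraph: sentences / paragraphs
  }
end

sample = "One two. Three four.\n\nFive six."
p rough_stats(sample)
```

The rest of the post builds the same thing up step by step against oliver.txt.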
Step 1
Obtaining some dummy text
We can find the text at http://rubyinside.com/book/oliver.txt
Step 2
Loading the text file and counting lines
################################
# analyzer.rb
# first approximation

# Start the counter at zero
line_count = 0

# Load the file line by line
File.open("oliver.txt").each { |line| line_count += 1 }

# Print the number of lines
puts line_count
################################
Next we need a way to read the text, and there are two alternatives: File.open and File.readlines. In the first approximations I'm going to use File.open and read the lines one at a time. At the end of the tutorial I'm going to switch to File.readlines, which returns all the lines as an array, so counting them is just a matter of asking for its size.
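To see the two alternatives side by side, here is a small sketch. It is not part of the analyzer: it writes a throwaway temporary file so it doesn't depend on oliver.txt, and the method names count_with_open and count_with_readlines are made up for this example.

```ruby
require 'tempfile'

# Write a tiny throwaway file so the example does not depend on oliver.txt.
sample = Tempfile.new('sample')
sample.write("first line\nsecond line\nthird line\n")
sample.close

# Alternative 1: open the file and count the lines as they are read.
# The block form of File.open closes the file when the block ends.
def count_with_open(path)
  count = 0
  File.open(path) { |file| file.each_line { count += 1 } }
  count
end

# Alternative 2: readlines returns an array of lines, so size is the count.
def count_with_readlines(path)
  File.readlines(path).size
end

puts count_with_open(sample.path)
puts count_with_readlines(sample.path)
```

Both calls print the same number. The readlines version is shorter, but it holds the whole file in memory at once, which is fine for a text of this size.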
################################
# analyzer.rb
# second approximation

# Add a variable to collect the lines together into one string as we go
text = ''
line_count = 0
File.open("oliver.txt").each do |line|
  line_count += 1
  text << line
end

puts "#{line_count} lines"
################################
Remember that using { and } to surround blocks is the standard style for single-line blocks, while "do" and "end" are preferred for multi-line blocks.
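A small side-by-side sketch of the two block styles. The method names word_lengths and total_length are made up for this illustration and are not part of the analyzer.

```ruby
# Single-line block: braces keep it compact.
def word_lengths(words)
  words.map { |word| word.length }
end

# Multi-line block: do ... end reads better once the body grows.
def total_length(words)
  total = 0
  words.each do |word|
    next if word.empty? # skip empty entries
    total += word.length
  end
  total
end

p word_lengths(%w[ruby is fun])
p total_length(%w[ruby is fun])
```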
################################
# analyzer.rb
# third approximation

# text is a string, so we can use the length method that all strings
# supply to get the exact size of the file, and therefore the number of
# characters, and then store the count excluding spaces in another variable
text = ''
line_count = 0
File.open("oliver.txt").each do |line|
  line_count += 1
  text << line
end

total_characters = text.length

# Use gsub to eradicate the whitespace from the text string, and then
# take the length of the newly "de-spacified" text
total_characters_nospaces = text.gsub(/\s+/, '').length

puts "#{line_count} lines"
puts "#{total_characters_nospaces} characters excluding spaces"
puts "#{total_characters} Total characters"
################################
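A quick check of what the gsub call removes. In a Ruby regular expression \s matches tabs and newlines as well as plain spaces, so "excluding spaces" here really means excluding all whitespace. The method name count_without_whitespace is made up for this sketch.

```ruby
# Hypothetical helper wrapping the same gsub call the analyzer uses.
def count_without_whitespace(text)
  text.gsub(/\s+/, '').length
end

sample = "a b\tc\nd" # one space, one tab, one newline
puts sample.length
puts count_without_whitespace(sample)
```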
Counting Words
There are two ways to do it:
1.- Count the number of groups of contiguous letters using scan
2.- Split the text on whitespace and count the resulting fragments using split and size
To get the number of words in the string, I can use the length or size array methods to count the number of elements rather than joining them together:
#################################
puts "this is a test".scan(/\w+/).length
# prints 4
#################################
The split approach demonstrates a core tenet of Ruby (as well as some other languages, particularly Perl): there's more than one way to do it! Analyzing different methods to solve the same problem is a crucial part of becoming a good programmer, as different methods can vary in their efficacy.
#################################
puts "this is a test".split.length
# prints 4
#################################
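The two approaches agree on "this is a test", but they can disagree on other input. A small sketch of where they differ (the method names words_by_scan and words_by_split are made up here): \w does not match an apostrophe, so scan breaks a contraction into two pieces, while split only breaks on whitespace.

```ruby
# Count runs of word characters (letters, digits, underscore).
def words_by_scan(text)
  text.scan(/\w+/).length
end

# Count whitespace-separated fragments.
def words_by_split(text)
  text.split.length
end

sentence = "I can't stop"
puts words_by_scan(sentence)  # "I", "can", "t", "stop"
puts words_by_split(sentence) # "I", "can't", "stop"
```

For ordinary prose the split version is probably closer to what a reader would call a word, which is the one the analyzer uses.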
################################
# analyzer.rb
# fourth approximation

# Counting words
text = ''
line_count = 0
File.open("oliver.txt").each do |line|
  line_count += 1
  text << line
end

word_count = text.split.length
total_characters = text.length
total_characters_nospaces = text.gsub(/\s+/, '').length

puts "#{line_count} lines"
puts "#{total_characters_nospaces} characters excluding spaces"
puts "#{total_characters} Total characters"
puts "#{word_count} words"
################################
COUNTING SENTENCES AND PARAGRAPHS
Rather than splitting on whitespace, sentences and paragraphs have different split criteria.
Sentences end with full stops, question marks, and exclamation marks. They can also be separated with dashes and other punctuation.
###################################
# Sentences: split on end-of-sentence punctuation with a regular expression
sentence_count = text.split(/\.|\?|!/).length

# Paragraphs: split on double newlines
puts text.split(/\n\n/).length
###################################
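A sketch of how these two splits behave on small strings. The method names count_sentences and count_paragraphs are made up for this example. The second call shows a limitation of the simple approach: an ellipsis contains three full stops, so it inflates the count.

```ruby
# Split on full stops, question marks, and exclamation marks.
def count_sentences(text)
  text.split(/\.|\?|!/).length
end

# Split on double newlines.
def count_paragraphs(text)
  text.split(/\n\n/).length
end

puts count_sentences("Hello! How are you? Fine.")
puts count_sentences("Wait... what?") # the ellipsis adds empty fragments
puts count_paragraphs("One.\n\nTwo.\n\nThree.")
```

The counts are therefore an estimate, which is good enough for the statistics in this tutorial.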
################################
# analyzer.rb
# fifth approximation

# Counting paragraphs and sentences
text = ''
line_count = 0
File.open("oliver.txt").each do |line|
  line_count += 1
  text << line
end

word_count = text.split.length
total_characters = text.length
total_characters_nospaces = text.gsub(/\s+/, '').length
sentence_count = text.split(/\.|\?|!/).length
paragraph_count = text.split(/\n\n/).length

puts "#{line_count} lines"
puts "#{total_characters_nospaces} characters excluding spaces"
puts "#{total_characters} Total characters"
puts "#{word_count} words"
puts "#{sentence_count} sentences"
puts "#{paragraph_count} paragraphs"
################################
Calculating averages
################################
# analyzer.rb
# sixth approximation

# Calculating averages
text = ''
line_count = 0
File.open("oliver.txt").each do |line|
  line_count += 1
  text << line
end

word_count = text.split.length
total_characters = text.length
total_characters_nospaces = text.gsub(/\s+/, '').length
# These two counts are needed for the averages below
sentence_count = text.split(/\.|\?|!/).length
paragraph_count = text.split(/\n\n/).length

puts "#{line_count} lines"
puts "#{total_characters_nospaces} characters excluding spaces"
puts "#{total_characters} Total characters"
puts "#{word_count} words"
puts "#{sentence_count / paragraph_count} sentences per paragraph (average)"
puts "#{word_count / sentence_count} words per sentence (average)"
################################
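One thing to keep in mind about the averages: dividing one integer by another with / in Ruby discards the fractional part. If a decimal average is wanted, one option is Integer#fdiv. The sketch below is only an illustration; the method name average and the guard for a zero count are my own additions, not part of the analyzer.

```ruby
# Hypothetical helper: decimal average rounded to one place,
# returning 0.0 instead of dividing by zero on empty input.
def average(total, count)
  return 0.0 if count.zero?
  total.fdiv(count).round(1)
end

puts 7 / 2         # integer division drops the remainder
puts average(7, 2) # floating-point division keeps it
```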
The code:
################################
# analyzer.rb

# Read the whole file as an array of lines
lines = File.readlines("oliver.txt")
line_count = lines.size
text = lines.join

word_count = text.split.length
total_characters = text.length
total_characters_nospaces = text.gsub(/\s+/, '').length
sentence_count = text.split(/\.|\?|!/).length
paragraph_count = text.split(/\n\n/).length

puts "#{line_count} lines"
puts "#{total_characters_nospaces} characters excluding spaces"
puts "#{total_characters} Total characters"
puts "#{word_count} words"
puts "#{sentence_count / paragraph_count} sentences per paragraph (average)"
puts "#{word_count / sentence_count} words per sentence (average)"
################################