Module: PlainText
- Includes:
- Util
- Defined in:
- lib/plain_text.rb,
lib/plain_text/part.rb,
lib/plain_text/util.rb,
lib/plain_text/error.rb,
lib/plain_text/split.rb,
lib/plain_text/parse_rule.rb,
lib/plain_text/builtin_type.rb,
lib/plain_text/part/boundary.rb,
lib/plain_text/part/paragraph.rb,
lib/plain_text/part/string_type.rb
Overview
Utility methods for mainly line-based processing of String
This module contains methods useful in processing a String object of a text file, that is, a String that contains an entire text file or a multiple-line part of one. The methods include normalizing the line-break codes, removing extra spaces from each line, etc. Many of the methods work on the basis of a line. For example, the #head and #tail methods work like the respective UNIX-shell commands, returning a specified number of lines at the head/tail of self.
Many of the methods contained directly in this module are meant to be included in String. It is debatable, though, whether including a third-party module in a core class is good practice.
Several module functions are also available. This module contains a helper module function PlainText.extend_this, with which an object extends this module easily as Singleton if this module is not already included.
A few methods in this module assume that Split is included in String, which by default is the case as soon as this file is read (by Ruby’s require). The specification may be subject to change in a future release.
Defined Under Namespace
Modules: BuiltinType, Split, Util
Classes: ParseRule, Part, PartNormalizeError
Constant Summary
- DefLineBreaks =
List of the default line breaks.
[ "\r\n", "\n", "\r" ]
- DEF_HEADTAIL_N_LINES =
10
- DEF_METHOD_OPTS =
Default options for class/instance methods
{
  :clean_text => {
    preserve_paragraph: true,
    boundary_style: true,  # If unspecified, will be replaced with lb_out * 2
    lbs_style: :truncate,
    lb_is_space: false,
    sps_style: :truncate,
    delete_asian_space: true,
    linehead_style: :none,
    linetail_style: :delete,
    firstlbs_style: :delete,
    lastsps_style: :truncate,
    lb: $/,
    lb_out: nil,  # If unspecified, will be replaced with lb
  },
  :count_char => {
    lbs_style: :delete,
    linehead_style: :delete,
    lastsps_style: :delete,
    lb_out: "\n",
  },
}
Class Method Summary
-
.__call_inst_method__(method, instr, *rest, **k) ⇒ #instr
Call instance method as a Module function.
-
.clean_text(prt, preserve_paragraph: , boundary_style: , lbs_style: , lb_is_space: , sps_style: , delete_asian_space: , linehead_style: , linetail_style: , firstlbs_style: , lastsps_style: , lb: , lb_out: , is_debug: false) ⇒ Object
Cleans the text.
-
.count_char(instr, *rest, lbs_style: , linehead_style: , lastsps_style: , lb_out: , **k) ⇒ Integer
Count the number of characters.
-
.delete_spaces_bw_cjk_european(instr, *rest) ⇒ Object
Module function of #delete_spaces_bw_cjk_european.
-
.extend_this(obj) ⇒ TrueClass, NilClass
If the class of the obj does not “include” this module, do so in the singular class.
-
.head(instr, *rest, **k) ⇒ Object
Module function of #head.
-
.head_inverse(instr, *rest, **k) ⇒ Object
Module function of #head_inverse.
-
.normalize_lb(instr, *rest, **k) ⇒ Object
Module function of #normalize_lb.
-
.tail(instr, *rest, **k) ⇒ Object
Module function of #tail.
-
.tail_inverse(instr, *rest, **k) ⇒ Object
Module function of #tail_inverse.
Instance Method Summary
-
#count_char(*rest, **k) ⇒ Integer
Count the number of characters.
-
#delete_spaces_bw_cjk_european(*rest) ⇒ Object
Non-destructive version of #delete_spaces_bw_cjk_european!.
-
#delete_spaces_bw_cjk_european!(repl = "") ⇒ MatchData, NilClass
Delete all the spaces between CJK and European characters or numbers.
-
#head(num_in = DEF_HEADTAIL_N_LINES, unit: :line, inclusive: true, padding: 0, linebreak: $/) ⇒ String
Returns the first num lines (or characters, bytes) or before the last n-th line.
-
#head!(*rest, **key) ⇒ self
Destructive version of #head.
-
#head_inverse(*rest, **key) ⇒ Object
Inverse of head - returns the content except for the first num lines (or characters, bytes).
-
#head_inverse!(*rest, **key) ⇒ self
Destructive version of #head_inverse.
-
#normalize_lb(*rest, **k) ⇒ Object
Non-destructive version of #normalize_lb!.
-
#normalize_lb!(repl = $/, lb_from: nil) ⇒ MatchData, NilClass
Normalizes line-breaks.
-
#strip_at_lines(*rest, **k) ⇒ Object
Non-destructive version of #strip_at_lines!.
-
#strip_at_lines!(strip_head: true, strip_tail: true, markdown: false, linebreak: $/) ⇒ self, NilClass
String#strip! for each line.
-
#strip_at_lines_head(*rest, **k) ⇒ Object
Non-destructive version of #strip_at_lines_head!.
-
#strip_at_lines_head!(linebreak: $/) ⇒ self, NilClass
String#strip! for each line but only for the head part (NOT tail part).
-
#strip_at_lines_tail(*rest, **k) ⇒ Object
Non-destructive version of #strip_at_lines_tail!.
-
#strip_at_lines_tail!(markdown: false, linebreak: $/) ⇒ self, NilClass
String#strip! for each line but only for the tail part (NOT head part).
-
#tail(num_in = DEF_HEADTAIL_N_LINES, unit: :line, inclusive: true, padding: 0, linebreak: $/) ⇒ String
Returns the last num lines (or characters, bytes) or of and after the first n-th line.
-
#tail!(*rest, **key) ⇒ self
Destructive version of #tail.
-
#tail_inverse(*rest, **key) ⇒ Object
Inverse of tail - returns the content except for the last num lines (or characters, bytes).
-
#tail_inverse!(*rest, **key) ⇒ self
Destructive version of #tail_inverse.
Class Method Details
.__call_inst_method__(method, instr, *rest, **k) ⇒ #instr
Call instance method as a Module function
The return String includes PlainText as Singleton.
# File 'lib/plain_text.rb', line 70
def self.__call_inst_method__(method, instr, *rest, **k)
  newself = instr.clone
  PlainText.extend_this(newself)
  newself.public_send(method, *rest, **k)
end
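The clone-then-extend pattern in the listing above can be illustrated without the library. The following is a minimal standalone sketch: the module DemoText and its methods shout! and call_inst_method are hypothetical names invented for the illustration and are not part of PlainText.

```ruby
# Hypothetical stand-in for PlainText; only the calling pattern is illustrated.
module DemoText
  # A destructive instance method, analogous to the bang methods of the module.
  def shout!
    replace(upcase + "!")
  end

  # Extends obj with this module unless obj already responds to shout!.
  # Returns true if extended, nil otherwise.
  def self.extend_this(obj)
    return nil if defined? obj.shout!
    obj.extend(self)
    true
  end

  # Calls an instance method on an extended clone, leaving instr untouched.
  def self.call_inst_method(method, instr, *rest, **k)
    newself = instr.clone
    extend_this(newself)
    newself.public_send(method, *rest, **k)
  end
end

original = String.new("abc")
result   = DemoText.call_inst_method(:shout!, original)
# result is a separate String that carries DemoText in its singleton class;
# original keeps its content and gains no extra methods.
```

Because the work is done on a clone, a destructive instance method can be exposed as a non-destructive module function, and String itself is never modified.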
.clean_text(prt, preserve_paragraph: , boundary_style: , lbs_style: , lb_is_space: , sps_style: , delete_asian_space: , linehead_style: , linetail_style: , firstlbs_style: , lastsps_style: , lb: , lb_out: , is_debug: false) ⇒ Object
Cleans the text
Such as, removing extra spaces, normalising the linebreaks, etc.
By default:
- Paragraphs (more than 2 \n) are taken into account (one \n between two): preserve_paragraph=true
- Blank lines are truncated into one line with no white spaces: boundary_style=lb_out*2(=$/*2)
- Consecutive white spaces are truncated into a single space: sps_style=:truncate
- White spaces before or after a CJK character are deleted: delete_asian_space=true
- Preceding white spaces in each line are preserved: linehead_style=:none
- Trailing white spaces in each line are deleted: linetail_style=:delete
- Line-breaks at the beginning of the entire input string are deleted: firstlbs_style=:delete
- Trailing white spaces and line-breaks at the end of the entire input string are truncated into a single linebreak: lastsps_style=:truncate
For a String with predominantly CJK characters, the following setting is recommended:
- lbs_style: :delete
- delete_asian_space: true (Default)
Note that for the Symbols in optional arguments, a Symbol consisting of the first character only is also accepted, e.g., :d instead of :delete (n.b., :t2 for :truncate2).
For more detail, see the description of each command-line option.
Note that for traditional genko-yoshi-style Japanese texts, where “jisage” marks each new paragraph, probably the best way is to make your own Part instance to give to this method, where the rule for the Part should be something like:
/(\A[[:blank:]]+|\n[[:space:]]+)/
# File 'lib/plain_text.rb', line 153
def self.clean_text(
      prt,
      preserve_paragraph: DEF_METHOD_OPTS[:clean_text][:preserve_paragraph],
      boundary_style:     DEF_METHOD_OPTS[:clean_text][:boundary_style], # If unspecified, will be replaced with lb_out * 2
      lbs_style:          DEF_METHOD_OPTS[:clean_text][:lbs_style],
      lb_is_space:        DEF_METHOD_OPTS[:clean_text][:lb_is_space],
      sps_style:          DEF_METHOD_OPTS[:clean_text][:sps_style],
      delete_asian_space: DEF_METHOD_OPTS[:clean_text][:delete_asian_space],
      linehead_style:     DEF_METHOD_OPTS[:clean_text][:linehead_style],
      linetail_style:     DEF_METHOD_OPTS[:clean_text][:linetail_style],
      firstlbs_style:     DEF_METHOD_OPTS[:clean_text][:firstlbs_style],
      lastsps_style:      DEF_METHOD_OPTS[:clean_text][:lastsps_style],
      lb:                 DEF_METHOD_OPTS[:clean_text][:lb],
      lb_out:             DEF_METHOD_OPTS[:clean_text][:lb_out], # If unspecified, will be replaced with lb
      is_debug: false
    )
  #isdebug = true if prt == "foo\n\n\nbar\n"
  lb_out ||= lb  # Output linebreak
  boundary_style = lb_out*2 if true == boundary_style
  boundary_style = "" if [:delete, :d].include? boundary_style
  lastsps_style = lb_out if :linebreak == lastsps_style

  if !prt.class.method_defined? :last_significant_element
    # Construct a Part instance from the given String.
    ret = ''
    begin
      prt = prt.unicode_normalize
    rescue ArgumentError  # (invalid byte sequence in UTF-8)
      warn "The given String in (#{self.name}\##{__method__}) seems wrong."
      raise
    end
    prt = normalize_lb(prt, "\n", lb_from: (DefLineBreaks.include?(lb) ? nil : lb)).dup
    kwd = (["\r\n", "\r", "\n"].include?(lb) ? {} : { rules: /#{Regexp.quote lb}{2,}/})
    prt = (preserve_paragraph ? Part.parse(prt, **kwd) : Part.new([prt]))
  else
    # If not preserve_paragraph, reconstructs it as a Part with a single Paragraph.
    # Also, deepcopy is needed, as this method is destructive.
    prt = (preserve_paragraph ? prt : Part.new([prt.join])).deepcopy
  end
  prt.squash_boundaries! # Boundaries are squashed.

  # Handles Boundary
  clean_text_boundary!(prt, boundary_style: boundary_style)

  # Handles linebreaks and spaces (within Paragraphs)
  clean_text_lbs_sps!(
    prt,
    lbs_style: lbs_style,
    lb_is_space: lb_is_space,
    sps_style: sps_style,
    delete_asian_space: delete_asian_space,
    is_debug: is_debug
  )

  # Handles the line head/tails.
  clean_text_line_head_tail!(
    prt,
    linehead_style: linehead_style,
    linetail_style: linetail_style
  )

  # Handles the file head/tail.
  clean_text_file_head_tail!(
    prt,
    firstlbs_style: firstlbs_style,
    lastsps_style: lastsps_style,
    is_debug: is_debug
  )

  # Replaces the linebreaks to the specified one
  prt.map{ |i| i.gsub!(/\n/m, lb_out) }

  (ret ? prt.join : prt)  # prt.to_s may be different from prt.join
end
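To give a feel for several of the defaults listed above, here is a rough standalone approximation in plain Ruby. It is only a sketch under simplifying assumptions: rough_clean is a hypothetical helper, it works on a bare String rather than on a Part, and it ignores lbs_style, lb_is_space, delete_asian_space and every non-default option, so its output may differ from that of clean_text.

```ruby
# Rough approximation of some clean_text defaults (hypothetical helper).
def rough_clean(text)
  s = text.gsub(/\r\n|\r/, "\n")       # normalise line-breaks to "\n"
  s = s.gsub(/[ \t]+$/, "")            # linetail_style: delete trailing blanks per line
  s = s.gsub(/\n{2,}/, "\n\n")         # boundary_style: blank lines become one boundary
  s = s.gsub(/(?<=\S)[ \t]{2,}/, " ")  # sps_style: truncate runs of blanks (line heads kept)
  s = s.sub(/\A\n+/, "")               # firstlbs_style: delete leading line-breaks
  s.sub(/\s+\z/, "\n")                 # lastsps_style: one line-break at the very end
end

cleaned = rough_clean("\n\nfoo   bar  \n \n\n  baz\n\n\n")
# cleaned is "foo bar\n\n  baz\n"
```

The indentation before "baz" survives because the run-of-blanks rule only fires after a non-space character, which mirrors linehead_style=:none.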
.count_char(instr, *rest, lbs_style: , linehead_style: , lastsps_style: , lb_out: , **k) ⇒ Integer
Count the number of characters
See clean_text for the optional parameters. The defaults of a few of the optional parameters differ from those of clean_text; for example, the default for lb_out is "\n" (newline, so that a line-break is 1 byte in size). The defaults are chosen so that this method is better suited to East-Asian (CJK) text, for which it is most useful; for European alphabets, counting the number of words, rather than characters as in this method, would be more standard.
# File 'lib/plain_text.rb', line 96
def self.count_char(instr, *rest,
      lbs_style:      DEF_METHOD_OPTS[:count_char][:lbs_style],
      linehead_style: DEF_METHOD_OPTS[:count_char][:linehead_style],
      lastsps_style:  DEF_METHOD_OPTS[:count_char][:lastsps_style],
      lb_out:         DEF_METHOD_OPTS[:count_char][:lb_out],
      **k
    )
  clean_text(instr, *rest, lbs_style: lbs_style, linehead_style: linehead_style, lastsps_style: lastsps_style, lb_out: lb_out, **k).size
end
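The effect of the output line-break on the count can be seen with plain String methods. This sketch does not call the library and does not reproduce the cleaning step; it only illustrates why a single-character line-break is chosen and how character and byte counts differ for CJK text.

```ruby
# A two-line Japanese text with Windows-style (CRLF) line-breaks.
text = "日本語\r\nテキスト\r\n"

# Converting every line-break to "\n" makes each one count as a single character.
normalized = text.gsub(/\r\n|\r/, "\n")

chars_before = text.size            # 11: 3 + 2 + 4 + 2
chars_after  = normalized.size      #  9: 3 + 1 + 4 + 1
bytes_after  = normalized.bytesize  # 23: each CJK character is 3 bytes in UTF-8
```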
.delete_spaces_bw_cjk_european(instr, *rest) ⇒ Object
Module function of #delete_spaces_bw_cjk_european
# File 'lib/plain_text.rb', line 229
def self.delete_spaces_bw_cjk_european(instr, *rest)
  __call_inst_method__(:delete_spaces_bw_cjk_european, instr, *rest)
end
.extend_this(obj) ⇒ TrueClass, NilClass
If the class of the obj does not “include” this module, do so in the singular class.
# File 'lib/plain_text.rb', line 80
def self.extend_this(obj)
  return nil if defined? obj.delete_spaces_bw_cjk_european!
  obj.extend(PlainText)
  true
end
.head(instr, *rest, **k) ⇒ Object
# File 'lib/plain_text.rb', line 241
def self.head(instr, *rest, **k)
  return PlainText.__call_inst_method__(:head, instr, *rest, **k)
end
.head_inverse(instr, *rest, **k) ⇒ Object
Module function of #head_inverse
The return String includes PlainText as Singleton.
# File 'lib/plain_text.rb', line 253
def self.head_inverse(instr, *rest, **k)
  return PlainText.__call_inst_method__(:head_inverse, instr, *rest, **k)
end
.normalize_lb(instr, *rest, **k) ⇒ Object
Module function of #normalize_lb
The return String includes PlainText as Singleton.
# File 'lib/plain_text.rb', line 265
def self.normalize_lb(instr, *rest, **k)
  return PlainText.__call_inst_method__(:normalize_lb, instr, *rest, **k)
end
.tail(instr, *rest, **k) ⇒ Object
# File 'lib/plain_text.rb', line 277
def self.tail(instr, *rest, **k)
  return PlainText.__call_inst_method__(:tail, instr, *rest, **k)
end
.tail_inverse(instr, *rest, **k) ⇒ Object
Module function of #tail_inverse
The return String includes PlainText as Singleton.
# File 'lib/plain_text.rb', line 289
def self.tail_inverse(instr, *rest, **k)
  return PlainText.__call_inst_method__(:tail_inverse, instr, *rest, **k)
end
Instance Method Details
#count_char(*rest, **k) ⇒ Integer
Count the number of characters
See .count_char and, further, clean_text for the optional parameters. The defaults of a few of the optional parameters differ from those of the latter; for example, the default for lb_out is "\n" (newline, so that a line-break is 1 byte in size). The defaults are chosen so that this method is better suited to East-Asian (CJK) text, for which it is most useful; for European alphabets, counting the number of words, rather than characters as in this method, would be more standard.
# File 'lib/plain_text.rb', line 540
def count_char(*rest, **k)
  PlainText.public_send(__method__, self, *rest, **k)
end
#delete_spaces_bw_cjk_european(*rest) ⇒ Object
Non-destructive version of #delete_spaces_bw_cjk_european!
# File 'lib/plain_text.rb', line 563
def delete_spaces_bw_cjk_european(*rest)
  newself = clone
  newself.delete_spaces_bw_cjk_european!(*rest)
  newself
end
#delete_spaces_bw_cjk_european!(repl = "") ⇒ MatchData, NilClass
Delete all the spaces between CJK and European characters or numbers.
All the spaces between CJK and European characters, numbers or punctuation marks are deleted or converted into a specified replacement character. In short, any spaces between a CJK character and a European character, number or punctuation mark are deleted, whichever of the two comes first. If the return value is non-nil, there is at least one match.
# File 'lib/plain_text.rb', line 553
def delete_spaces_bw_cjk_european!(repl="")
  ret = gsub!(/(\p{Hiragana}|\p{Katakana}|[ー-]|[一-龠々]|\p{Han}|\p{Hangul})([[:blank:]]+)([[:upper:][:lower:][:digit:][:punct:]])/, '\1\3')
  ret ||= gsub!(/([[:upper:][:lower:][:digit:][:punct:]])([[:blank:]]+)(\p{Hiragana}|\p{Katakana}|[ー-]|[一-龠々]|\p{Han}|\p{Hangul})/, '\1\3')
end
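The two-direction substitution can be tried out in isolation. The sketch below is a simplified, non-destructive variant and not the library method: squeeze_cjk_gaps, CJK and EURO are hypothetical names, the character classes are slightly reduced, and both passes are always applied.

```ruby
# Simplified character classes (assumption: reduced from the listing above).
CJK  = /\p{Hiragana}|\p{Katakana}|\p{Han}|\p{Hangul}/
EURO = /[[:upper:][:lower:][:digit:][:punct:]]/

# Deletes blanks between a CJK character and a European letter, digit or
# punctuation mark, in either order. Blanks between two European words or
# between two CJK characters are left alone.
def squeeze_cjk_gaps(str)
  str.gsub(/(#{CJK})[[:blank:]]+(#{EURO})/, '\1\2')
     .gsub(/(#{EURO})[[:blank:]]+(#{CJK})/, '\1\2')
end

mixed = squeeze_cjk_gaps("東京 Tower と 2 つ")  # "東京Towerと2つ"
plain = squeeze_cjk_gaps("Hello World")        # unchanged
```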
#head(num_in = DEF_HEADTAIL_N_LINES, unit: :line, inclusive: true, padding: 0, linebreak: $/) ⇒ String
Returns the first num lines (or characters, bytes) or before the last n-th line.
If “byte” is specified as the return unit, the encoding is the same as self, though the encoding for the returned String may not be valid anymore. Note that it is probably the better practice to use string[ 0..5 ] and string#byteslice(0,5) instead of this method for the units of “char” and “byte”, respectively.
For num, a negative number means counting from the last (e.g., -1 (lines, if unit is :line) means everything but the last line, and -5 means everything but the last 5 lines), whereas 0 is forbidden. If too large a negative number is given, such as -9 for a String of 2 lines, an empty string is returned.
If unit is :line, num can be a Regexp, in which case the lines up to the first line that matches the given Regexp are returned, where the process is based on lines. For example, if num is /ABC/ (Regexp), the lines from the beginning up to the line that contains the string “ABC” are returned.
# File 'lib/plain_text.rb', line 598
def head(num_in=DEF_HEADTAIL_N_LINES, unit: :line, inclusive: true, padding: 0, linebreak: $/)
  if num_in.class.method_defined? :to_int
    num = num_in.to_int
    raise ArgumentError, "Non-positive num (#{num_in}) is given in #{__method__}" if num.to_int < 1
  elsif num_in.class.method_defined? :named_captures
    re_in = num_in
  else
    raise raise_typeerror(num_in, 'Integer or Range')
  end

  case unit
  when :line, "-n"
    # Regexp (for boundary)
    return head_regexp(re_in, inclusive: inclusive, padding: padding, linebreak: linebreak) if re_in

    # Integer (a number of lines)
    ret = split(linebreak, -1)[0..(num-1)].join(linebreak)  # -1 is specified to preserve the last linebreak(s).
    return ret if size <= ret.size  # Specified line is larger than the original or the last NL is missing.
    return(ret << linebreak)  # NL is added to the tail as in the original.
  when :char
    return self[0..(num-1)]
  when :byte, "-c"
    return self.byteslice(0..(num-1))
  else
    raise ArgumentError, "Specified unit (#{unit}.inspect) is invalid in #{__method__}"
  end
end
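The Integer branch for unit :line can be exercised on its own. The following standalone sketch lifts that branch out of the listing into a hypothetical free function head_lines; the Regexp, :char and :byte branches are omitted.

```ruby
# Returns the first num lines of str, keeping the final line-break when the
# original has one (hypothetical helper mirroring the Integer/:line branch).
def head_lines(str, num = 10, linebreak: "\n")
  raise ArgumentError, "Non-positive num (#{num}) is given" if num < 1

  # The -1 limit keeps trailing empty fields, so a final line-break is not lost.
  ret = str.split(linebreak, -1)[0..(num - 1)].join(linebreak)
  return ret if str.size <= ret.size  # num exceeds the line count, or no final line-break
  ret + linebreak                     # restore the line-break that join dropped
end

two_lines   = head_lines("a\nb\nc\n", 2)  # "a\nb\n"
whole_short = head_lines("a\nb", 5)       # "a\nb"
```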
#head!(*rest, **key) ⇒ self
Destructive version of #head
# File 'lib/plain_text.rb', line 574
def head!(*rest, **key)
  replace(head(*rest, **key))
end
#head_inverse(*rest, **key) ⇒ Object
Inverse of head - returns the content except for the first num lines (or characters, bytes)
# File 'lib/plain_text.rb', line 639
def head_inverse(*rest, **key)
  s2 = head(*rest, **key)
  (s2.size >= size) ? self[0,0] : self[s2.size..-1]
end
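The relation between the head and its inverse, namely that the two pieces concatenate back to the original, can be checked with a small standalone sketch. split_at_head is a hypothetical helper; it takes the head with String#lines instead of the library's #head and then applies the same size-based slicing as the listing above.

```ruby
# Splits str into [first num lines, everything else] (hypothetical helper).
def split_at_head(str, num, linebreak: "\n")
  head = str.lines(linebreak).first(num).join
  rest = head.size >= str.size ? str[0, 0] : str[head.size..-1]
  [head, rest]
end

head, rest = split_at_head("a\nb\nc\n", 1)
# head is "a\n", rest is "b\nc\n", and head + rest restores the original.
```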
#head_inverse!(*rest, **key) ⇒ self
Destructive version of #head_inverse
# File 'lib/plain_text.rb', line 631
def head_inverse!(*rest, **key)
  replace(head_inverse(*rest, **key))
end
#normalize_lb(*rest, **k) ⇒ Object
Non-destructive version of #normalize_lb!
# File 'lib/plain_text.rb', line 668
def normalize_lb(*rest, **k)
  newself = clone  # must be clone (not dup) so Singleton methods, which may include this method, must be included.
  newself.normalize_lb!(*rest, **k)
  newself
end
#normalize_lb!(repl = $/, lb_from: nil) ⇒ MatchData, NilClass
Normalizes line-breaks
All the line-breaks of self are converted into a new character or \n. If the return value is non-nil, self contains unexpected line-break characters for the OS.
# File 'lib/plain_text.rb', line 653
def normalize_lb!(repl=$/, lb_from: nil)
  ret = nil
  lb_from ||= DefLineBreaks
  lb_from = [lb_from].flatten
  lb_from.each do |ea_lb|
    gsub!(/#{ea_lb}/, repl) if ($/ != ea_lb) || ($/ == ea_lb && repl != ea_lb)
    ret = $~ if ($/ != ea_lb) && !ret
  end
  ret
end
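As a point of comparison, line-break normalisation can be written as a single pass in plain Ruby. This is a standalone sketch and not the library's loop: normalize_breaks is a hypothetical helper that returns a new String and does not report whether foreign line-breaks were found.

```ruby
# Converts CRLF, CR and LF line-breaks to repl in one pass (hypothetical helper).
# "\r\n" must come first in the alternation; otherwise a CRLF pair would be
# treated as two separate line-breaks.
def normalize_breaks(str, repl = "\n")
  str.gsub(/\r\n|\r|\n/, repl)
end

unix = normalize_breaks("a\r\nb\rc\n")       # "a\nb\nc\n"
dos  = normalize_breaks("a\nb\r", "\r\n")    # "a\r\nb\r\n"
```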
#strip_at_lines(*rest, **k) ⇒ Object
Non-destructive version of #strip_at_lines!
# File 'lib/plain_text.rb', line 693
def strip_at_lines(*rest, **k)
  newself = clone  # must be clone (not dup) so Singleton methods, which may include this method, must be included.
  newself.strip_at_lines!(*rest, **k)
  newself
end
#strip_at_lines!(strip_head: true, strip_tail: true, markdown: false, linebreak: $/) ⇒ self, NilClass
String#strip! for each line
# File 'lib/plain_text.rb', line 682
def strip_at_lines!(strip_head: true, strip_tail: true, markdown: false, linebreak: $/)
  strip_head = false if markdown
  r1 = strip_at_lines_head!(                    linebreak: linebreak) if strip_head
  r2 = strip_at_lines_tail!(markdown: markdown, linebreak: linebreak) if strip_tail
  (r1 || r2) ? self : nil
end
#strip_at_lines_head(*rest, **k) ⇒ Object
Non-destructive version of #strip_at_lines_head!
# File 'lib/plain_text.rb', line 713
def strip_at_lines_head(*rest, **k)
  newself = clone  # must be clone (not dup) so Singleton methods, which may include this method, must be included.
  newself.strip_at_lines_head!(*rest, **k)
  newself
end
#strip_at_lines_head!(linebreak: $/) ⇒ self, NilClass
String#strip! for each line but only for the head part (NOT tail part)
# File 'lib/plain_text.rb', line 704
def strip_at_lines_head!(linebreak: $/)
  lb_quo = Regexp.quote linebreak
  gsub!(/(\A|#{lb_quo})[[:blank:]]+/m, '\1')
end
#strip_at_lines_tail(*rest, **k) ⇒ Object
Non-destructive version of #strip_at_lines_tail!
# File 'lib/plain_text.rb', line 737
def strip_at_lines_tail(*rest, **k)
  newself = clone  # must be clone (not dup) so Singleton methods, which may include this method, must be included.
  newself.strip_at_lines_tail!(*rest, **k)
  newself
end
#strip_at_lines_tail!(markdown: false, linebreak: $/) ⇒ self, NilClass
String#strip! for each line but only for the tail part (NOT head part)
# File 'lib/plain_text.rb', line 724
def strip_at_lines_tail!(markdown: false, linebreak: $/)
  lb_quo = Regexp.quote linebreak
  return gsub!(/(?<=^|[^[:blank:]])[[:blank:]]+(#{lb_quo}|\z)/m, '\1') if ! markdown

  r1 = gsub!(/(?<=^|[^[:blank:]])[[:blank:]]{3,}(#{lb_quo}|\z)/m, '\1')
  r2 = gsub!(/(?<=^|[^[:blank:]])[[:blank:]](#{lb_quo}|\z)/m, '\1')
  (r1 || r2) ? self : nil
end
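The effect of the markdown option is easiest to see on a small input. The sketch below reuses the three patterns from the listing above in a hypothetical non-destructive function strip_tail; it is not the library method and always returns a String. In markdown mode, exactly two trailing blanks (a Markdown hard line-break) survive, while a single blank or a run of three or more is removed.

```ruby
# Non-destructive sketch of the tail-stripping rules (hypothetical helper).
def strip_tail(str, markdown: false, linebreak: "\n")
  lb_quo = Regexp.quote(linebreak)
  unless markdown
    # Plain mode: every run of trailing blanks is removed.
    return str.gsub(/(?<=^|[^[:blank:]])[[:blank:]]+(#{lb_quo}|\z)/m, '\1')
  end

  # Markdown mode: drop runs of 3 or more blanks, then lone single blanks.
  str.gsub(/(?<=^|[^[:blank:]])[[:blank:]]{3,}(#{lb_quo}|\z)/m, '\1')
     .gsub(/(?<=^|[^[:blank:]])[[:blank:]](#{lb_quo}|\z)/m, '\1')
end

sample   = "a  \nb   \nc \nd"
plain    = strip_tail(sample)                  # "a\nb\nc\nd"
markdown = strip_tail(sample, markdown: true)  # "a  \nb\nc\nd"
```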
#tail(num_in = DEF_HEADTAIL_N_LINES, unit: :line, inclusive: true, padding: 0, linebreak: $/) ⇒ String
Returns the last num lines (or characters, bytes) or of and after the first n-th line.
If “byte” is specified as the return unit, the encoding is the same as self, though the encoding for the returned String may not be valid anymore. Note that it is probably the better practice to use string[ -5..-1 ] and string#byteslice(-5,5) instead of this method for the units of “char” and “byte”, respectively.
For num, a negative number means counting from the first (e.g., -1 [lines, if unit is :line] means everything but the first line, and -5 means everything but the first 5 lines), whereas 0 is forbidden. If too large a negative number is given, such as -9 for a String of 2 lines, an empty string is returned.
If unit is :line, num can be a Regexp, in which case the lines after the first line that matches the given Regexp are returned (not inclusive), where the process is based on lines. For example, if num is /ABC/, the lines from the line following the first line that contains the string “ABC” to the last one are returned. “The next line” means (1) the line immediately after the match if the matched string has a linebreak at the end, or (2) the line after the first linebreak after the matched string, where the trailing characters from the end of the matched string to the linebreak (inclusive) are ignored.
Tips
To specify the last line that matches the Regexp, consider prefixing (?:.*) with the option m, e.g., /(?:.*)ABC/m
Note for developers
The line that matches the Regexp has to be exclusive, because otherwise specifying the last line that matches would be impossible in principle. For example, to specify the last line that matches ABC, the given Regexp should be /(?:.*)ABC/m (see the above Tips); in this case, if the matched line were inclusive, all the lines from Line 1 would be included, which is most likely not what the caller wants.
# File 'lib/plain_text.rb', line 786
def tail(num_in=DEF_HEADTAIL_N_LINES, unit: :line, inclusive: true, padding: 0, linebreak: $/)
  if num_in.class.method_defined? :to_int
    num = num_in.to_int
    raise ArgumentError, "num of zero is given in #{__method__}" if num == 0
    num += 1 if num < 0
  elsif num_in.class.method_defined? :named_captures
    re_in = num_in
  else
    raise raise_typeerror(num_in, 'Integer or Range')
  end

  case unit
  when :line, '-n'
    # Regexp (for boundary)
    return tail_regexp(re_in, inclusive: inclusive, padding: padding, linebreak: linebreak) if re_in

    # Integer (a number of lines)
    return tail_linenum(num_in, num, linebreak: linebreak)
  when :char
    num = 0 if num >= size && num_in > 0
    return self[(-num)..-1]
  when :byte, '-c'
    num = 0 if num >= bytesize && num_in > 0
    return self.byteslice((-num)..-1)
  else
    raise ArgumentError, "Specified unit (#{unit}.inspect) is invalid in #{__method__}"
  end
end
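The line-based rules described above, including the exclusive Regexp match and the tip about /(?:.*)ABC/m, can be sketched in plain Ruby. The helpers tail_lines and tail_after are hypothetical and do not reproduce tail_linenum or tail_regexp, which are not shown here; in particular, returning an empty String when nothing matches is an assumption of this sketch.

```ruby
# Last num lines for positive num; everything but the first |num| lines for
# negative num; zero is rejected (hypothetical helper).
def tail_lines(str, num = 10)
  raise ArgumentError, "num of zero is given" if num == 0
  lines = str.lines
  (num > 0 ? lines.last(num) : lines.drop(-num)).join
end

# Lines after the line in which the match of re ends (exclusive).
# Assumption: an empty String is returned when re does not match.
def tail_after(str, re, linebreak: "\n")
  m = re.match(str)
  return "" unless m
  rest = m.post_match
  return rest if m[0].end_with?(linebreak)      # case (1): match ends with a line-break
  i = rest.index(linebreak)                     # case (2): skip the rest of that line
  i ? rest[(i + linebreak.size)..-1] : ""
end

text        = "x1\nABC 1\ny\nABC 2\nz\n"
after_first = tail_after(text, /ABC/)           # "y\nABC 2\nz\n"
after_last  = tail_after(text, /(?:.*)ABC/m)    # "z\n"
```

With the m option the greedy (?:.*) also consumes line-breaks, so the match ends at the last "ABC" and only the lines after it are returned.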
#tail!(*rest, **key) ⇒ self
Destructive version of #tail
# File 'lib/plain_text.rb', line 748
def tail!(*rest, **key)
  replace(tail(*rest, **key))
end
#tail_inverse(*rest, **key) ⇒ Object
Inverse of tail - returns the content except for the last num lines (or characters, bytes)
# File 'lib/plain_text.rb', line 828
def tail_inverse(*rest, **key)
  s2 = tail(*rest, **key)
  (s2.size >= size) ? self[0,0] : self[0..(size-s2.size-1)]
end
#tail_inverse!(*rest, **key) ⇒ self
Destructive version of #tail_inverse
# File 'lib/plain_text.rb', line 820
def tail_inverse!(*rest, **key)
  replace(tail_inverse(*rest, **key))
end