Pseudo Date

What is a Pseudo Date?

It's a date but not really. A PseudoDate object has a day, month, and year but it does not require all of them like the built in ruby date classes. This allows you to parse obscure date strings that may or may not be complete.

What Is This For?

PseudoDate was created to parse odd dates in odd formats and attempt to extract as much information from them as possible. It's especially handy when you're trying to convert a date string that has come from an OCR'd source. It was primarily written to parse dates in American public record data in an effort to have a common date format when doing record matching.

Assumptions

As with all parsing, one needs to make assumptions. The main assumption made here is that all dates will be in the past. Dates that appear to be far-future are generally labeled as "invalid."

Since this gem was built for trying to wrangle OCR'd dates we have to make some assumptions when it comes to date formats. As of 0.2.0 the gem now assumes that dates separated by a "/" are American dates and those that are separated by a "-" are European dates. Future versions may allow some configuration for this depending on your usage but in my experience there has not been a need for that. This mimics the behavior from Ruby 1.8.7 which was changed in Ruby 1.9+.

Other Notes

PseudoDate stores date attributes in strings instead of integers to avoid losing the preceding '0' on various attributes. This was a decision made when first creating the class because of the way things were being output in the project it was created for. There has been some discussion about switching these to integers in order to help save on memory but no decision has been made here either way.

Compatability

PseudoDates are not really compatible with other built-in date/time objects. They do support some of the basic methods for abstracting numbers though. PseudoDates that are of exact precision can be turned into ruby date objects.

>> p = PseudoDate.new('19850625')
 => #<PseudoDate:0x10190eff0 @month="06", @date_hash={:day=>"25", :month=>"06", :year=>"1985"}, year"1985", day"25"
>> p.precision
 => "exact" 
>> p.year
 => "1985" 
>> p.month
 => "06" 
>> p.day
 => "25"
>> p.to_date
 => #<Date: 4892483/2,0,2299161>

Examples

>> PseudoDate.new('19850625').to_hash
 => {:day=>"25", :month=>"06", :year=>"1985"}

>> PseudoDate.new('1985-25-06').to_hash
 => {:day=>"25", :month=>"06", :year=>"1985"}

>> PseudoDate.new('06-25-1985').to_hash
 => {:day=>"25", :month=>"06", :year=>"1985"}

>> PseudoDate.new('25-06-1985').to_hash
 => {:day=>"25", :month=>"06", :year=>"1985"}

>> PseudoDate.new('06/25/1985').to_hash
 => {:day=>"25", :month=>"06", :year=>"1985"}

>> PseudoDate.new('06/1985').to_hash
 => {:day=>"00", :month=>"06", :year=>"1985"}

>> PseudoDate.new('85').to_hash
 => {:day=>"00", :month=>"00", :year=>"1985"}

>> PseudoDate.new('1985').to_hash
 => {:day=>"00", :month=>"00", :year=>"1985"}

>> PseudoDate.new('Jun 25, 1985').to_hash
 => {:day=>"25", :month=>"06", :year=>"1985"}

Patches, Bugfixes, Additions...

Feel free to fork and add stuff as you see fit. This was written quickly to solve some problems so it could certainly use some more structure and organization.

If you add something or fix something, make sure you run the very basic tests and add any new ones as required by your changes.