Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Proper Type support to Excelx. #240

Merged

Conversation

stevendaniels
Copy link
Contributor

Add Proper Type support to Excelx.

This commit implements proper type support for Roo::Excelx. I've created several
classes that inherit from Cell that are based on the types of Excel cells. There are two
main advantages to splitting Cell into classes based on cell types. First, the cell's value
is consistent with it's type. Second, it is possible to implement a formatted_value, a
string value based on the cell's format. This is especially useful for dealing with number
formats like percents, or date formats.

Original Type Support

Previously cell values were calculated by Cell#type_cast_value, but the
methods did not always cast a value as expected (e.g. Integers were cast as Floats).

Additionally, cells used an excelx_value method which contained a string
representation of the cell's original value.

New Type Support

The new cell type classes keep excelx_value as an alias to cell_value. Cell
value is intended to be a method that can be used regardless of what kind of
spreadsheet is being used. The excelx_value for cells created from
SharedStrings has changed: it now returns the string (it previously returned the
index for the SharedString). I've marked excelx_value as deprecated.

The new cell classes also have a formatted_value method. This method returns a
String representation of the value. These formatted values are used when
exporting to csv and are based on either a standard excel format, or a custom
format.

Standard formats follow certain conventions. Date fields for days, months, and years are separated with hyphens or slashes ("-", /") (e.g. 01-JAN, 1/13/15). Time fields for hours, minutes, and seconds are separated with a colon (e.g. 12:45:01).

If a custom format follows those conventions, then the custom format will be
used for the a cell's formatted value. Otherwise, the formatted value will be in
the following format: 'YYYY-mm-dd HH:MM:SS' (e.g. "2015-07-10 20:33:15").

Finally, the original value method has been upgraded to better match the type
of the value. Cell::Numbers will return an Integer or Float depending what type
of number the value should be. Cell::Booleans will return a Boolean. (One
exception: Cell::Time, which will continue to return seconds since midnight.)

This commit is related to and should resolve the following issues: #21, #53,
#86, #133, #139.

Issues/Caveats:

While implementing this feature, I noticed a few potential issues:

  1. Should Date and DateTime be separate classes?
  2. Number Formatting:
  3. Excel Format 255 rows limit in Excel #48, i.e. ##0.0E+0 is probably incorrect. I need to see examples of
    this format before I can properly understand how to produce the formatted value.
  4. Numbers can be other colors besides Red. The following colors are possible
    when using custom number formatting. [BLACK], [BLUE], [CYAN], [GREEN], [MAGENTA],
    [RED], [WHITE], [YELLOW], [COLOR n]
  5. I'm not quite sure what happens when number field uses a comma instead of a dot ("," in place of ".")

@stevendaniels
Copy link
Contributor Author

As far as I can tell, the decrease in coverage is because several methods in Roo::Excel::Cell are no longer called.

I'm investigating the failure when using rubinius. [update: Tests using rbx-2.5.7 pass, not sure why rbx-2.5.8 would fail]

@pabloh
Copy link
Contributor

pabloh commented Jul 18, 2015

@simonoff, @stevendaniels
Seems like a big change, I think it would be a good idea to release roo 2.0.2 with the changes already at master, before merging this.

@stevendaniels
Copy link
Contributor Author

Good idea. One issue: The master branch had already merged a couple of pull requests (#232, #224, #219) that change the public API and merit a minor upgrade to 2.1.x. A 2.0.2 release should only include the fixes.

On Sat, Jul 18, 2015 at 8:30 AM, Pablo Herrero [email protected]
wrote:

@simonoff, @stevendaniels

Seems like a big change, I think it would be a good idea to release roo 2.0.2 with the changes already at master, before merging this.

Reply to this email directly or view it on GitHub:
#240 (comment)

@simonoff
Copy link
Member

+1 for 2.1.x

@pabloh
Copy link
Contributor

pabloh commented Jul 18, 2015

You are right about the version number, there were some API changes. I've sent this PR (#241) to streamline things a bit.

@simonoff
Copy link
Member

@pabloh can you move rbx in .travis.yml to allow failure section?

@pabloh
Copy link
Contributor

pabloh commented Jul 18, 2015

@simonoff, done, is running now...

@mbouchard
Copy link

Do you have an idea when this patch will be merged? I really need the related issue #238 to be fixed for a project I'm working on.

@stevendaniels
Copy link
Contributor Author

I'll do some rebasing and try merge the changes in in the next day or so. Ping me if it isn't done by the weekend.

I'd been spending some time trying to see if I could easily create a cell types for OpenOffice/LibreOffice. Unfortunately, those changes would require far too much work at this stage.

stevendaniels added a commit that referenced this pull request Aug 20, 2015
@stevendaniels stevendaniels merged commit 23d6244 into roo-rb:master Aug 20, 2015
@mbouchard
Copy link

Thank you! It's now working very well. Do you think you will publish a new version of the gem soon?

@stevendaniels
Copy link
Contributor Author

I'll try to get a new version out by the end of this weekend.

@mbouchard
Copy link

I have the same problem (issue #238) with the LibreOffice / OpenOffice.org formats (ods). Do you think it's easy to patch that format the same way you patched the Excel 2007 - 2013 formats (xlsx, xlsm) ?

@stevendaniels
Copy link
Contributor Author

@mbouchard

It looks like OpenOffice could be patched in a way similar to #238. Look at lib/roo/open_office.rb Line 414. Would appreciate a pull request for this one.

@thotmx
Copy link
Contributor

thotmx commented Oct 9, 2015

I've made the PR #258. It was a quick patch, but it works.
@mbouchard Please, take a look to these changes. I hope, it would work for you.
@stevendaniels Your suggestion showed me the way, but I'm very confused. I couldn't find the merged code from #238 in the master branch.

@stevendaniels
Copy link
Contributor Author

@thotmx #238 was never merged into master because #240 also implemented the same functionality and overhauled the way other types worked, too.

@thotmx
Copy link
Contributor

thotmx commented Oct 9, 2015

@stevendaniels Ok. I'll take a look to this code to understand how it's working.
Thanks.

@mbouchard
Copy link

@thotmx It's all working fine now. Thank you.

@thotmx
Copy link
Contributor

thotmx commented Oct 13, 2015

@mbouchard you're welcome. I'm glad it's working.

stevendaniels added a commit that referenced this pull request Oct 31, 2015
This commit serves three purposes.

1. Helps prepare Roo for the new interface mentioned in #240.
2. Helps clean up Roo::Base, which is doing too much.
3. Marked unnecessary public functions for future deprecation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants