Skip to content
This repository has been archived by the owner on Oct 8, 2019. It is now read-only.

DataChunk* should handle nil dates coming in from Cascalog #152

Closed
robinkraft opened this issue Jul 20, 2012 · 16 comments
Closed

DataChunk* should handle nil dates coming in from Cascalog #152

robinkraft opened this issue Jul 20, 2012 · 16 comments

Comments

@robinkraft
Copy link
Contributor

Currently, as seen in issue #71, the !date field cannot be nullable if we're doing thrift packing within a query. Yet the precondition seems to handle dates correctly. The workaround we're using in #141 simply removes the !date field completely from the query, but we should get a better handle on why we can't have a nullable date field.

This query illustrates the problem:

(??- (let [src [["ndvi" (thrift/ModisPixelLocation* "500" 28 8 0 0) 1 "16" nil]]]
                                     (<- [?dc]
                                         (src ?name ?loc ?val ?t-res !date)
                                         (thrift/DataChunk* ?name ?loc ?val ?t-res !date :> ?dc))))
Caused by: java.lang.AssertionError: Assert failed: (or (not date) (string? (first date)))
    at forma.thrift$DataChunk_STAR_.doInvoke(thrift.clj:289)
    at clojure.lang.RestFn.invoke(RestFn.java:497)
    at clojure.lang.Var.invoke(Var.java:431)
    at clojure.lang.AFn.applyToHelper(AFn.java:178)
    at clojure.lang.Var.applyTo(Var.java:532)
    at cascalog.ClojureCascadingBase.applyFunction(ClojureCascadingBase.java:68)
    at cascalog.ClojureMap.operate(ClojureMap.java:34)
    at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:86)
@ghost ghost assigned eightysteele Jul 20, 2012
@eightysteele
Copy link
Contributor

Yeah my bad. The precondition on date is WRONG. Fixing now with a tug request.

@robinkraft
Copy link
Contributor Author

Is it? (or (not date) ,,,) should handle nil values just fine, no?

On Fri, Jul 20, 2012, at 09:29 AM, Aaron Steele wrote:

Yeah my bad. The precondition on date is WRONG. Fixing now with a tug request.


Reply to this email directly or view it on GitHub:
#152 (comment)

@eightysteele
Copy link
Contributor

Yeah but date is optional, which means you'll get an empty sequence which isn't nil.

@robinkraft
Copy link
Contributor Author

The fields coming in from Cascalog should be [[[1] [2] [3] [nil]]], no? So the check should be (or (not (first date)) ,,,) or something - so that we're pulling the nil out of the sequence. (first nil) returns nil, so it works outside the cascalog context.

On Fri, Jul 20, 2012, at 09:32 AM, Aaron Steele wrote:

Yeah but date is optional, which means you'll get an empty sequence which isn't nil.


Reply to this email directly or view it on GitHub:
#152 (comment)

@eightysteele
Copy link
Contributor

Yeah I'll shoot a pull in a mo.

@eightysteele
Copy link
Contributor

#154

@eightysteele
Copy link
Contributor

So now you're query works if you add the :date keyword:

(??- (let [src [["ndvi" (thrift/ModisPixelLocation* "500" 28 8 0 0) 1 "16" nil]]]
       (<- [?dc]
           (src ?name ?loc ?val ?t-res !date)
           (thrift/DataChunk* ?name ?loc ?val ?t-res :date !date :> ?dc))))

@robinkraft
Copy link
Contributor Author

Does this require changing other queries that use the date field?

On Jul 20, 2012, at 10:10 AM, Aaron Steele [email protected] wrote:

So now you're query works if you add the :date keyword:

(??- (let [src [["ndvi" (thrift/ModisPixelLocation* "500" 28 8 0 0) 1 "16" nil]]]
      (<- [?dc]
          (src ?name ?loc ?val ?t-res !date)
          (thrift/DataChunk* ?name ?loc ?val ?t-res :date !date :> ?dc))))

Reply to this email directly or view it on GitHub:
#152 (comment)

@eightysteele
Copy link
Contributor

Ha, whoops. Yeah, all queries that use thrift/DataChunk* need to be updated. I'll do that now and commit.

@robinkraft
Copy link
Contributor Author

Ok, please run the tests for all of those namespaces too so that we are sure this isn't breaking something. Is there an easier way that avoids breaking existing queries, or is this a better way to handle optional args?

On Jul 20, 2012, at 10:30 AM, Aaron Steele [email protected] wrote:

Ha, whoops. Yeah, all queries that use thrift/DataChunk* need to be updated. I'll do that now and commit.


Reply to this email directly or view it on GitHub:
#152 (comment)

@eightysteele
Copy link
Contributor

I think making optional args explicit via keywords is the way to go here. I mean, yeah, it's an API breaking change that requires some fixing on the caller side, but it's worth it IMO. Open to ideas though!

@robinkraft
Copy link
Contributor Author

Fixed in #154

@sritchie
Copy link
Contributor

What does (thrift/DataChunk* "ndvi" (thrift/ModisPixelLocation* "500" 28 8 0 0) 1 "16" nil) return?

On Fri, Jul 20, 2012 at 9:18 AM, Robin Kraft <
[email protected]

wrote:

Currently, as seen in issue #71, the !date field cannot be nullable if
we're doing thrift packing within a query. Yet the precondition
seems to handle dates correctly. The workaround we're using in #141 simply
removes the !date field completely from the query, but we should get a
better handle on why we can't have a nullable date field.

This query illustrates the problem:

(??- (let [src [["ndvi" (thrift/ModisPixelLocation* "500" 28 8 0 0) 1 "16"
nil]]]
                                     (<- [?dc]
                                         (src ?name ?loc ?val ?t-res !date)
                                         (thrift/DataChunk* ?name ?loc
?val ?t-res !date :> ?dc))))
Caused by: java.lang.AssertionError: Assert failed: (or (not date)
(string? (first date)))
        at forma.thrift$DataChunk_STAR_.doInvoke(thrift.clj:289)
        at clojure.lang.RestFn.invoke(RestFn.java:497)
        at clojure.lang.Var.invoke(Var.java:431)
        at clojure.lang.AFn.applyToHelper(AFn.java:178)
        at clojure.lang.Var.applyTo(Var.java:532)
        at
cascalog.ClojureCascadingBase.applyFunction(ClojureCascadingBase.java:68)
        at cascalog.ClojureMap.operate(ClojureMap.java:34)
        at
cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:86)

Reply to this email directly or view it on GitHub:
#152

Sam Ritchie, Twitter Inc
703.662.1337
@sritchie09

(Too brief? Here's why! http://emailcharter.org)

@robinkraft
Copy link
Contributor Author

Throws this:

Assert failed: (or (not date) (string? (first date)))
  [Thrown class java.lang.AssertionError]

On Sat, Jul 21, 2012, at 02:52 PM, Sam Ritchie wrote:

What does (thrift/DataChunk* "ndvi" (thrift/ModisPixelLocation* "500" 28 8 0 0) 1 "16" nil) return?

On Fri, Jul 20, 2012 at 9:18 AM, Robin Kraft <
[email protected]

wrote:

Currently, as seen in issue #71, the !date field cannot be nullable if
we're doing thrift packing within a query. Yet the precondition
seems to handle dates correctly. The workaround we're using in #141 simply
removes the !date field completely from the query, but we should get a
better handle on why we can't have a nullable date field.

This query illustrates the problem:

(??- (let [src [["ndvi" (thrift/ModisPixelLocation* "500" 28 8 0 0) 1 "16"
nil]]]
                                     (<- [?dc]
                                         (src ?name ?loc ?val ?t-res !date)
                                         (thrift/DataChunk* ?name ?loc
?val ?t-res !date :> ?dc))))
Caused by: java.lang.AssertionError: Assert failed: (or (not date)
(string? (first date)))
        at forma.thrift$DataChunk_STAR_.doInvoke(thrift.clj:289)
        at clojure.lang.RestFn.invoke(RestFn.java:497)
        at clojure.lang.Var.invoke(Var.java:431)
        at clojure.lang.AFn.applyToHelper(AFn.java:178)
        at clojure.lang.Var.applyTo(Var.java:532)
        at
cascalog.ClojureCascadingBase.applyFunction(ClojureCascadingBase.java:68)
        at cascalog.ClojureMap.operate(ClojureMap.java:34)
        at
cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:86)

Reply to this email directly or view it on GitHub:
#152

Sam Ritchie, Twitter Inc
703.662.1337
@sritchie09

(Too brief? Here's why! http://emailcharter.org)


Reply to this email directly or view it on GitHub:
#152 (comment)

@eightysteele
Copy link
Contributor

Nope. API now requires :date keyword so Sam's example won't make it to that
assert failure. See thrift tests. Sam, try it at the repl?
On Jul 21, 2012 3:01 PM, "Robin Kraft" <
[email protected]>
wrote:

Throws this:

Assert failed: (or (not date) (string? (first date)))
  [Thrown class java.lang.AssertionError]

On Sat, Jul 21, 2012, at 02:52 PM, Sam Ritchie wrote:

What does (thrift/DataChunk* "ndvi" (thrift/ModisPixelLocation* "500" 28 8 0 0) 1 "16" nil) return?

On Fri, Jul 20, 2012 at 9:18 AM, Robin Kraft <
[email protected]

wrote:

Currently, as seen in issue #71, the !date field cannot be nullable
if
we're doing thrift packing within a query. Yet the [precondition](

https://github.com/reddmetrics/forma-clj/blob/062d74f037e2b65d88154727ef95ad6c46603434/src/clj/forma/thrift.clj#L279
)
seems to handle dates correctly. The workaround we're using in #141
simply
removes the !date field completely from the query, but we should get
a
better handle on why we can't have a nullable date field.

This query illustrates the problem:

(??- (let [src [["ndvi" (thrift/ModisPixelLocation* "500" 28 8 0 0) 1
"16"
nil]]]
                                     (<- [?dc]
                                         (src ?name ?loc ?val ?t-res
!date)
                                         (thrift/DataChunk* ?name ?loc
?val ?t-res !date :> ?dc))))
Caused by: java.lang.AssertionError: Assert failed: (or (not date)
(string? (first date)))
        at forma.thrift$DataChunk_STAR_.doInvoke(thrift.clj:289)
        at clojure.lang.RestFn.invoke(RestFn.java:497)
        at clojure.lang.Var.invoke(Var.java:431)
        at clojure.lang.AFn.applyToHelper(AFn.java:178)
        at clojure.lang.Var.applyTo(Var.java:532)
        at

cascalog.ClojureCascadingBase.applyFunction(ClojureCascadingBase.java:68)
        at cascalog.ClojureMap.operate(ClojureMap.java:34)
        at

cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:86)

Reply to this email directly or view it on GitHub:
#152

Sam Ritchie, Twitter Inc
703.662.1337
@sritchie09

(Too brief? Here's why! http://emailcharter.org)


Reply to this email directly or view it on GitHub:
#152 (comment)


Reply to this email directly or view it on GitHub:
#152 (comment)

@eightysteele
Copy link
Contributor

@sritchie I'm at a REPL now.

(DataChunk* "ndvi" (ModisPixelLocation* "500" 28 8 0 0) 1 "16" nil)

;; No value supplied for key: null
;;  [Thrown class java.lang.IllegalArgumentException]

With the new API, we're required to pass in a :date keyword:

(DataChunk* "ndvi" (ModisPixelLocation* "500" 28 8 0 0) 1 "16" :date nil)

forma.thrift> (DataChunk* "ndvi" (ModisPixelLocation* "500" 28 8 0 0) 1 "16" :date nil)
#<DataChunk DataChunk(dataset:ndvi, locationProperty:LocationProperty(property:<LocationPropertyValue pixelLocation:ModisPixelLocation(resolution:500, tileH:28, tileV:8, sample:0, line:0)>), chunkValue:<DataValue longVal:1>, temporalRes:16)>

All good?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants