Commit graph

17 commits

Author SHA1 Message Date
3770190db5 test clean 2014-08-10 20:10:00 -04:00
kortschak
7265e1d7a1 Use github.com/peterh/liner for REPL lines
This gives us history and line conveniences.
2014-08-07 15:02:30 +09:30
kortschak
6acfdcc5d6 Use concrete value for quad.Quad
Comparison of -short benchmarks in cayley.

$ benchcmp pointer.bench concrete.bench
benchmark                                   old ns/op     new ns/op     delta
BenchmarkNamePredicate                      1673276       1655093       -1.09%
BenchmarkLargeSetsNoIntersection            318985907     261499984     -18.02%
BenchmarkNetAndSpeed                        104403743     41516981      -60.23%
BenchmarkKeanuAndNet                        17309258      16857513      -2.61%
BenchmarkKeanuAndSpeed                      20159161      19282833      -4.35%

Comparisons of pathological cases are not so happy.

benchmark                                   old ns/op       new ns/op        delta
BenchmarkVeryLargeSetsSmallIntersection     55269775527     246084606672     +345.24%
BenchmarkHelplessContainsChecker            23436501319     24308906949      +3.72%
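
The delta column in the tables above is the relative change from the old ns/op figure to the new one. As a minimal sketch of that arithmetic (the function names `percentDelta` and `formatDelta` are hypothetical and are not part of benchcmp or cayley), two rows from the tables can be recomputed like this:

```go
package main

import "fmt"

// percentDelta returns the relative change from oldNs to newNs as a
// percentage. Negative values mean the new code is faster.
// The name is hypothetical; it only mirrors the arithmetic of the delta column.
func percentDelta(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

// formatDelta renders the change with an explicit sign and two decimals.
func formatDelta(oldNs, newNs float64) string {
	return fmt.Sprintf("%+.2f%%", percentDelta(oldNs, newNs))
}

func main() {
	// BenchmarkLargeSetsNoIntersection, values taken from the table above.
	fmt.Println(formatDelta(318985907, 261499984)) // -18.02%
	// BenchmarkVeryLargeSetsSmallIntersection, values taken from the table above.
	fmt.Println(formatDelta(55269775527, 246084606672)) // +345.24%
}
```

Read this way, +345.24% means the pathological case takes roughly 4.45 times as long with concrete values as with pointers.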

Profiling the worst case:

Pointer:
Total: 6121 samples
    1973  32.2%  32.2%     1973  32.2% runtime.findfunc
     773  12.6%  44.9%      773  12.6% readvarint
     510   8.3%  53.2%      511   8.3% step
     409   6.7%  59.9%      410   6.7% runtime.gentraceback
     390   6.4%  66.2%      391   6.4% pcvalue
     215   3.5%  69.8%      215   3.5% runtime.funcdata
     181   3.0%  72.7%      181   3.0% checkframecopy
     118   1.9%  74.6%      119   1.9% runtime.funcspdelta
      96   1.6%  76.2%       96   1.6% runtime.topofstack
      76   1.2%  77.5%       76   1.2% scanblock

Concrete:
Total: 25027 samples
    9437  37.7%  37.7%     9437  37.7% runtime.findfunc
    3853  15.4%  53.1%     3853  15.4% readvarint
    2366   9.5%  62.6%     2366   9.5% step
    2186   8.7%  71.3%     2186   8.7% runtime.gentraceback
    1816   7.3%  78.5%     1816   7.3% pcvalue
    1016   4.1%  82.6%     1016   4.1% runtime.funcdata
     859   3.4%  86.0%      859   3.4% checkframecopy
     506   2.0%  88.1%      506   2.0% runtime.funcspdelta
     410   1.6%  89.7%      410   1.6% runtime.topofstack
     303   1.2%  90.9%      303   1.2% runtime.newstack
2014-08-05 23:25:02 +09:30
kortschak
ffb52af00b Rename GremlinTimeout -> Timeout
Given that there may be other Turing complete query interfaces
(particularly a Go query API), the timeout config should not be
specifically tied to gremlin.
2014-08-02 23:28:24 +09:30
kortschak
09943c3eb6 Move sexp into query 2014-07-31 09:36:43 +09:30
kortschak
a6cf432313 Move query interface definitions into query 2014-07-31 08:52:24 +09:30
kortschak
41f6d3fd84 Temporarily use cquads only
I intend to make this configurable, but there is a tight connection
between db.Load and db.Open that is getting in the way of that.

Testing on data set 30kmoviedata.cq.gz created by doing:

zcat 30kmoviedata.nq.gz | sed 's/[<>]//g' | gzip -c > 30kmoviedata.cq.gz
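
For readers without zcat and sed to hand, the same transformation can be sketched in Go. This is only an illustration of what the pipeline above does (decompress, delete every `<` and `>`, recompress). The names `stripAngleBrackets` and `convert` are hypothetical and do not come from cayley, and the sample quad is made up.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// angleBrackets deletes every '<' and '>', like sed 's/[<>]//g'.
var angleBrackets = strings.NewReplacer("<", "", ">", "")

// stripAngleBrackets is a hypothetical helper name for this sketch.
func stripAngleBrackets(line string) string {
	return angleBrackets.Replace(line)
}

// convert reads gzipped text from src, strips angle brackets line by line,
// and writes the result to dst as gzip.
func convert(dst io.Writer, src io.Reader) error {
	zr, err := gzip.NewReader(src)
	if err != nil {
		return err
	}
	defer zr.Close()

	zw := gzip.NewWriter(dst)
	sc := bufio.NewScanner(zr) // the default buffer caps lines at 64 KiB
	for sc.Scan() {
		if _, err := io.WriteString(zw, stripAngleBrackets(sc.Text())+"\n"); err != nil {
			return err
		}
	}
	if err := sc.Err(); err != nil {
		return err
	}
	return zw.Close()
}

func main() {
	// Build a tiny gzipped input in memory instead of reading a real file.
	var in bytes.Buffer
	zw := gzip.NewWriter(&in)
	io.WriteString(zw, "<alice> <follows> <bob> .\n")
	zw.Close()

	var out bytes.Buffer
	if err := convert(&out, &in); err != nil {
		fmt.Println("convert:", err)
		return
	}

	// Decompress the result to show what was written.
	zr, err := gzip.NewReader(&out)
	if err != nil {
		fmt.Println("read back:", err)
		return
	}
	plain, err := io.ReadAll(zr)
	if err != nil {
		fmt.Println("read back:", err)
		return
	}
	fmt.Print(string(plain)) // alice follows bob .
}
```

Like the sed expression, this removes angle brackets everywhere on the line, including any that appear inside quoted literals.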

The following query is successful:

[{
  "type": "/film/film",
  "name": null,
  "/film/film/directed_by": {
    "name": "David Fincher"
  },
  "/film/film/starring": [{
    "/film/performance/actor": {
      "name": null
    }
  }]
}]

TODO: fix up naming for quads and make strict parsing an option.
2014-07-28 21:56:32 +09:30
kortschak
401c58426f Create quads hierarchy
* Move nquads into quad.
* Create cquads simplified parser in quad.
* Move Triple (renamed Quad) to quad.

Also made sure mongo actually implements BulkLoader.
2014-07-28 21:36:22 +09:30
kortschak
d76213fb2d Handle comments in N-Quad documents and REPL
The parser rejects an N-Quad with a comment, so we filter those out
ahead of time. This simplifies the grammar and code generated by the
parser.
2014-07-25 11:22:24 +09:30
kortschak
0e0e382d2b Use error returns and interface type for parsing
Fixes issue #72

This change simplifies interactions with parsing N-Quads and makes
reading datasets more robust. Changes made while here also improve
performance:

benchmark           old ns/op     new ns/op     delta
BenchmarkParser     1058          667           -36.96%

We still use string concatenation which I'm not wildly happy about, but
I think this can be left for a later change.

Initial changes towards idiomatic error handling have been made. More
significant changes are needed, but these have subtle design implications
and need to be thought about more.

30kmoviesdata.nt.gz has been altered to properly escape double quotes.
This was done mechanically and with manual curation to pick up
stragglers.
2014-07-22 20:34:37 +09:30
kortschak
177059cc16 Destutter nquads 2014-06-28 13:33:00 +09:30
kortschak
40f3363cde Destutter graph/... 2014-06-28 13:29:16 +09:30
kortschak
913d567ae1 Destutter mql 2014-06-28 12:58:03 +09:30
kortschak
3a673a333c Destutter gremlin 2014-06-28 12:55:21 +09:30
kortschak
388618bfa7 Destutter cayley/config 2014-06-28 12:42:15 +09:30
kortschak
c4a19a4e35 Simplify names in cmd source 2014-06-28 12:38:51 +09:30
kortschak
639559544d Reorganise to make cmd code more prominent 2014-06-28 02:14:09 +09:30