Merge branch 'master' into benchmarks

kortschak 2014-07-31 08:35:34 +09:30
commit 1a0dd13735
77 changed files with 14272 additions and 1476 deletions


@ -1,6 +1,6 @@
{
"Arch": "amd64 386",
"Os": "linux darwin windows",
"ResourcesInclude": "README.md,static,templates,LICENSE,AUTHORS,CONTRIBUTORS,docs,cayley.cfg.example,30kmoviedata.nt.gz,testdata.nt",
"ResourcesInclude": "README.md,static,templates,LICENSE,AUTHORS,CONTRIBUTORS,docs,cayley.cfg.example,30kmoviedata.nq.gz,testdata.nq",
"ConfigVersion": "0.9"
}

BIN
30kmoviedata.nq.gz Normal file

Binary file not shown.

Binary file not shown.


@ -72,13 +72,13 @@ cayley> graph.Vertex("dani").Out("follows").All()
For somewhat more interesting data, a sample of 30k movies from Freebase comes in the checkout.
```
./cayley repl --dbpath=30kmoviedata.nt.gz
./cayley repl --dbpath=30kmoviedata.nq.gz
```
To run the web frontend, replace the "repl" command with "http"
```
./cayley http --dbpath=30kmoviedata.nt.gz
./cayley http --dbpath=30kmoviedata.nq.gz
```
And visit port 64210 on your machine, commonly [http://localhost:64210](http://localhost:64210)
@ -90,13 +90,13 @@ The default environment is based on [Gremlin](http://gremlindocs.com/) and is si
You'll notice we have a special object, `graph` or `g`, which is how you can interact with the graph.
The simplest query is merely to return a single vertex. Using the 30kmovies.nt dataset from above, let's walk through some simple queries:
The simplest query is merely to return a single vertex. Using the 30kmoviedata.nq dataset from above, let's walk through some simple queries:
```javascript
// Query all vertices in the graph, limit to the first 5 vertices found.
graph.Vertex().GetLimit(5)
// Start with only one vertex, the literal name "Humphrey Bogart", and retreive all of them.
// Start with only one vertex, the literal name "Humphrey Bogart", and retrieve all of them.
graph.Vertex("Humphrey Bogart").All()
// `g` and `V` are synonyms for `graph` and `Vertex` respectively, as they are quite common.

TODO.md

@ -26,11 +26,11 @@ Usually something that should be taken care of.
### Bootstraps
Start discussing bootstrap triples, things that make the database self-describing, if they exist (though they need not). Talk about sameAs and indexing and type systems and whatnot.
### Better surfacing of Provenance
### Better surfacing of Label
It exists, it's indexed, but it's basically useless right now
### Optimize HasA Iterator
There are some simple optimizations that can be done there. And was the first one to get right, this is the next one.
There are some simple optimizations that can be done there. And was the first one to get right, this is the next one.
A simple example is just to convert the HasA to a fixed (next them out) if the subiterator size is guessable and small.
### Gremlin features
@ -39,7 +39,7 @@ A simple example is just to convert the HasA to a fixed (next them out) if the s
A way to limit the number of subresults at a point, without even running the query. Essentially, much as GetLimit() does for the end, be able to do the same in between
#### "Up" and "Down" traversals
Getting to the predicates from a node, or the nodes from a predicate, or some odd combinations thereof. Ditto for provenance.
Getting to the predicates from a node, or the nodes from a predicate, or some odd combinations thereof. Ditto for label.
#### Value comparison
Expose the value-comparison iterator in the language
@ -66,7 +66,7 @@ The necessary component to make mid-query limit work. Acts as a limit on Next(),
Hopefully easy now that the AppEngine shim exists. Questionably fast.
### Postgres Backend
It'd be nice to run on SQL as well. It's a big why not?
It'd be nice to run on SQL as well. It's a big why not?
#### Generalist layout
Notionally, this is a simple triple table with a number of indices. Iterators and iterator optimization (i.e., rewriting SQL queries) are the 'fun' part
#### "Short Schema" Layout?
@ -75,7 +75,7 @@ The necessary component to make mid-query limit work. Acts as a limit on Next(),
### New Iterators
#### Predicate Iterator
Really, this is just the generalized value comparison iterator, across strings and dates and such.
Really, this is just the generalized value comparison iterator, across strings and dates and such.
## Longer Term (and fuzzy)
@ -83,7 +83,7 @@ The necessary component to make mid-query limit work. Acts as a limit on Next(),
There's a whole body of work there, and a lot of interested researchers. They're the choir who already know the sermon of graph stores. Once ease-of-use gets people in the door, supporting extensions that make everyone happy seems like a win. And because we're query-language agnostic, it's a cleaner win. See also bootstrapping, which is the first goal toward this (eg, let's talk about sameAs, and index it appropriately.)
### Replication
Technically it works now if you piggyback on someone else's replication, but that's cheating. We speak HTTP, we can send triple sets over the wire to some other instance. Bonus points for a way to apply morphisms first -- massive graph on the backend, important graph on the frontend.
Technically it works now if you piggyback on someone else's replication, but that's cheating. We speak HTTP, we can send triple sets over the wire to some other instance. Bonus points for a way to apply morphisms first -- massive graph on the backend, important graph on the frontend.
### Related services
Eg, topic service, recon service -- whether in Cayley itself or as part of the greater project.
@ -102,6 +102,6 @@ The necessary component to make mid-query limit work. Acts as a limit on Next(),
### All sorts of backends:
#### Git?
Can we access git in a meaningful fashion, giving a history and rollbacks to memory/flat files?
#### ElasticSearch
#### Cassandra
#### ElasticSearch
#### Cassandra
#### Redis


@ -1,6 +1,6 @@
{
"database": "mem",
"db_path": "30k.nt",
"db_path": "30kmoviedata.nq.gz",
"read_only": true,
"load_size": 10000,
"gremlin_timeout": 10


@ -25,7 +25,8 @@ import (
"github.com/barakmich/glog"
"github.com/google/cayley/config"
"github.com/google/cayley/graph"
"github.com/google/cayley/nquads"
"github.com/google/cayley/quad"
"github.com/google/cayley/quad/cquads"
)
func Load(ts graph.TripleStore, cfg *config.Config, path string) error {
@ -40,7 +41,7 @@ func Load(ts graph.TripleStore, cfg *config.Config, path string) error {
glog.Fatalln(err)
}
dec := nquads.NewDecoder(r)
dec := cquads.NewDecoder(r)
bulker, canBulk := ts.(graph.BulkLoader)
if canBulk {
@ -56,7 +57,7 @@ func Load(ts graph.TripleStore, cfg *config.Config, path string) error {
return err
}
block := make([]*graph.Triple, 0, cfg.LoadSize)
block := make([]*quad.Quad, 0, cfg.LoadSize)
for {
t, err := dec.Unmarshal()
if err != nil {


@ -26,7 +26,7 @@ import (
"github.com/google/cayley/config"
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/sexp"
"github.com/google/cayley/nquads"
"github.com/google/cayley/quad/cquads"
"github.com/google/cayley/query/gremlin"
"github.com/google/cayley/query/mql"
)
@ -99,6 +99,11 @@ func Repl(ts graph.TripleStore, queryLanguage string, cfg *config.Config) error
if len(line) == 0 {
continue
}
line = bytes.TrimSpace(line)
if len(line) == 0 || line[0] == '#' {
line = line[:0]
continue
}
if bytes.HasPrefix(line, []byte(":debug")) {
ses.ToggleDebug()
fmt.Println("Debug Toggled")
@ -107,7 +112,7 @@ func Repl(ts graph.TripleStore, queryLanguage string, cfg *config.Config) error
}
if bytes.HasPrefix(line, []byte(":a")) {
var tripleStmt = line[3:]
triple, err := nquads.Parse(string(tripleStmt))
triple, err := cquads.Parse(string(tripleStmt))
if triple == nil {
if err != nil {
fmt.Printf("not a valid triple: %v\n", err)
@ -121,7 +126,7 @@ func Repl(ts graph.TripleStore, queryLanguage string, cfg *config.Config) error
}
if bytes.HasPrefix(line, []byte(":d")) {
var tripleStmt = line[3:]
triple, err := nquads.Parse(string(tripleStmt))
triple, err := cquads.Parse(string(tripleStmt))
if triple == nil {
if err != nil {
fmt.Printf("not a valid triple: %v\n", err)


@ -93,7 +93,7 @@ POST Body: JSON triples
"subject": "Subject Node",
"predicate": "Predicate Node",
"object": "Object node",
"provenance": "Provenance node" // Optional
"label": "Label node" // Optional
}] // More than one triple allowed.
```
@ -121,7 +121,7 @@ POST Body: JSON triples
"subject": "Subject Node",
"predicate": "Predicate Node",
"object": "Object node",
"provenance": "Provenance node" // Optional
"label": "Label node" // Optional
}] // More than one triple allowed.
```
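
The renamed optional field can be illustrated with a small Go sketch. This is not code from the commit: the `jsonQuad` type and the `encodeQuads` helper are hypothetical names, and only the four JSON keys (`subject`, `predicate`, `object`, `label`) come from the documentation above. It shows one way a client might build the POST body so that `label` is omitted when empty.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jsonQuad is a hypothetical client-side type; only the JSON keys are
// taken from the documented POST body.
type jsonQuad struct {
	Subject   string `json:"subject"`
	Predicate string `json:"predicate"`
	Object    string `json:"object"`
	Label     string `json:"label,omitempty"` // optional, dropped when empty
}

// encodeQuads (hypothetical helper) renders a list of quads as a JSON array.
func encodeQuads(qs []jsonQuad) ([]byte, error) {
	return json.Marshal(qs)
}

func main() {
	body, err := encodeQuads([]jsonQuad{
		{Subject: "alice", Predicate: "follows", Object: "bob"},
		{Subject: "bob", Predicate: "follows", Object: "dani", Label: "demo"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
	// [{"subject":"alice","predicate":"follows","object":"bob"},{"subject":"bob","predicate":"follows","object":"dani","label":"demo"}]
}
```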


@ -28,13 +28,13 @@ You can repeat the `--db` and `--dbpath` flags from here forward instead of the
First we load the data.
```bash
./cayley load --config=cayley.cfg.overview --triples=30kmoviedata.nt.gz
./cayley load --config=cayley.cfg.overview --triples=30kmoviedata.nq.gz
```
And wait. It will load. If you'd like to watch it load, you can run
```bash
./cayley load --config=cayley.cfg.overview --triples=30kmoviedata.nt.gz --alsologtostderr
./cayley load --config=cayley.cfg.overview --triples=30kmoviedata.nq.gz --alsologtostderr
```
And watch the log output go by.


@ -14,8 +14,7 @@
package graph
// Define the general iterator interface, as well as the Base iterator which all
// iterators can "inherit" from to get default iterator functionality.
// Define the general iterator interface.
import (
"strings"
@ -24,18 +23,46 @@ import (
"github.com/barakmich/glog"
)
type Tagger struct {
tags []string
fixedTags map[string]Value
}
// Adds a tag to the iterator.
func (t *Tagger) Add(tag string) {
t.tags = append(t.tags, tag)
}
func (t *Tagger) AddFixed(tag string, value Value) {
if t.fixedTags == nil {
t.fixedTags = make(map[string]Value)
}
t.fixedTags[tag] = value
}
// Returns the tags. The returned value must not be mutated.
func (t *Tagger) Tags() []string {
return t.tags
}
// Returns the fixed tags. The returned value must not be mutated.
func (t *Tagger) Fixed() map[string]Value {
return t.fixedTags
}
func (t *Tagger) CopyFrom(src Iterator) {
for _, tag := range src.Tagger().Tags() {
t.Add(tag)
}
for k, v := range src.Tagger().Fixed() {
t.AddFixed(k, v)
}
}
type Iterator interface {
// Tags are the way we handle results. By adding a tag to an iterator, we can
// "name" it, in a sense, and at each step of iteration, get a named result.
// TagResults() is therefore the handy way of walking an iterator tree and
// getting the named results.
//
// Tag Accessors.
AddTag(string)
Tags() []string
AddFixedTag(string, Value)
FixedTags() map[string]Value
CopyTagsFrom(Iterator)
Tagger() *Tagger
// Fills a tag-to-result-value map.
TagResults(map[string]Value)
@ -58,19 +85,10 @@ type Iterator interface {
// All of them should set iterator.Last to be the last returned value, to
// make results work.
//
// Next() advances the iterator and returns the next valid result. Returns
// (<value>, true) or (nil, false)
Next() (Value, bool)
// NextResult() advances iterators that may have more than one valid result,
// from the bottom up.
NextResult() bool
// Return whether this iterator is reliably nextable. Most iterators are.
// However, some iterators, like "not" are, by definition, the whole database
// except themselves. Next() on these is unproductive, if impossible.
CanNext() bool
// Check(), given a value, returns whether or not that value is within the set
// held by this iterator.
Check(Value) bool
@ -117,6 +135,25 @@ type Iterator interface {
UID() uint64
}
type Nexter interface {
// Next() advances the iterator and returns the next valid result. Returns
// (<value>, true) or (nil, false)
Next() (Value, bool)
Iterator
}
// Next is a convenience function that conditionally calls the Next method
// of an Iterator if it is a Nexter. If the Iterator is not a Nexter, Next
// returns a nil Value and false.
func Next(it Iterator) (Value, bool) {
if n, ok := it.(Nexter); ok {
return n.Next()
}
glog.Errorln("Nexting an un-nextable iterator")
return nil, false
}
// FixedIterator wraps iterators that are modifiable by addition of fixed value sets.
type FixedIterator interface {
Iterator
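
Note on the hunk above: replacing `CanNext()` with an optional `Nexter` interface is the usual Go type-assertion pattern. The following is a minimal, self-contained sketch of that pattern, not the commit's code: `Value` and `Iterator` are cut-down stand-ins for `graph.Value` and `graph.Iterator`, and `counter`, `checkOnly`, and `drain` are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// Value stands in for graph.Value in this sketch.
type Value interface{}

// Iterator is a cut-down stand-in for graph.Iterator; only Check is kept.
type Iterator interface {
	Check(Value) bool
}

// Nexter is implemented only by iterators that can enumerate their contents.
type Nexter interface {
	Next() (Value, bool)
	Iterator
}

// Next calls it.Next() when it is a Nexter and returns (nil, false) otherwise.
func Next(it Iterator) (Value, bool) {
	if n, ok := it.(Nexter); ok {
		return n.Next()
	}
	return nil, false
}

// counter is a hypothetical nextable iterator over 0..max-1.
type counter struct{ at, max int }

func (c *counter) Check(v Value) bool {
	i, ok := v.(int)
	return ok && i >= 0 && i < c.max
}

func (c *counter) Next() (Value, bool) {
	if c.at >= c.max {
		return nil, false
	}
	v := c.at
	c.at++
	return v, true
}

// checkOnly is a hypothetical iterator that can test membership but
// cannot enumerate, so it does not implement Nexter.
type checkOnly struct{}

func (checkOnly) Check(Value) bool { return true }

// drain collects every value that Next yields for it.
func drain(it Iterator) []Value {
	var out []Value
	for {
		v, ok := Next(it)
		if !ok {
			return out
		}
		out = append(out, v)
	}
}

func main() {
	fmt.Println(drain(&counter{max: 3})) // [0 1 2]
	fmt.Println(drain(checkOnly{}))      // []
}
```

Callers that need to know in advance whether enumeration is possible can use the same assertion directly, as `optimizeOrder` does with `it.(graph.Nexter)` later in this diff.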


@ -31,19 +31,25 @@ import (
// An All iterator across a range of int64 values, from `min` to `max`.
type Int64 struct {
Base
uid uint64
tags graph.Tagger
max, min int64
at int64
result graph.Value
}
// Creates a new Int64 with the given range.
func NewInt64(min, max int64) *Int64 {
var all Int64
BaseInit(&all.Base)
all.max = max
all.min = min
all.at = min
return &all
return &Int64{
uid: NextUID(),
min: min,
max: max,
at: min,
}
}
func (it *Int64) UID() uint64 {
return it.uid
}
// Start back at the beginning
@ -55,13 +61,28 @@ func (it *Int64) Close() {}
func (it *Int64) Clone() graph.Iterator {
out := NewInt64(it.min, it.max)
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
func (it *Int64) Tagger() *graph.Tagger {
return &it.tags
}
// Fill the map based on the tags assigned to this iterator.
func (it *Int64) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
// Prints the All iterator as just an "all".
func (it *Int64) DebugString(indent int) string {
return fmt.Sprintf("%s(%s tags: %v)", strings.Repeat(" ", indent), it.Type(), it.Tags())
return fmt.Sprintf("%s(%s tags: %v)", strings.Repeat(" ", indent), it.Type(), it.tags.Tags())
}
// Next() on an Int64 all iterator is a simple incrementing counter.
@ -76,10 +97,28 @@ func (it *Int64) Next() (graph.Value, bool) {
if it.at > it.max {
it.at = -1
}
it.Last = val
it.result = val
return graph.NextLogOut(it, val, true)
}
// DEPRECATED
func (it *Int64) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Int64) Result() graph.Value {
return it.result
}
func (it *Int64) NextResult() bool {
return false
}
// No sub-iterators.
func (it *Int64) SubIterators() []graph.Iterator {
return nil
}
// The number of elements in an Int64 is the size of the range.
// The size is exact.
func (it *Int64) Size() (int64, bool) {
@ -93,7 +132,7 @@ func (it *Int64) Check(tsv graph.Value) bool {
graph.CheckLogIn(it, tsv)
v := tsv.(int64)
if it.min <= v && v <= it.max {
it.Last = v
it.result = v
return graph.CheckLogOut(it, v, true)
}
return graph.CheckLogOut(it, v, false)
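
Note on the pattern used in this file: with `Base` removed, each iterator holds its own `Tagger` and `result` and spells out `TagResults` itself. Below is a minimal sketch of that composition under stated assumptions: `Tagger` mirrors the methods shown in the `graph` package hunk, `Value` stands in for `graph.Value`, and the `single` iterator type is hypothetical.

```go
package main

import "fmt"

// Value stands in for graph.Value in this sketch.
type Value interface{}

// Tagger mirrors the struct added to the graph package in this commit.
type Tagger struct {
	tags      []string
	fixedTags map[string]Value
}

func (t *Tagger) Add(tag string) { t.tags = append(t.tags, tag) }

func (t *Tagger) AddFixed(tag string, v Value) {
	if t.fixedTags == nil {
		t.fixedTags = make(map[string]Value)
	}
	t.fixedTags[tag] = v
}

func (t *Tagger) Tags() []string          { return t.tags }
func (t *Tagger) Fixed() map[string]Value { return t.fixedTags }

// single is a hypothetical iterator that holds exactly one result.
type single struct {
	tags   Tagger
	result Value
}

func (it *single) Tagger() *Tagger { return &it.tags }
func (it *single) Result() Value   { return it.result }

// TagResults maps every plain tag to the current result and every
// fixed tag to its fixed value.
func (it *single) TagResults(dst map[string]Value) {
	for _, tag := range it.tags.Tags() {
		dst[tag] = it.Result()
	}
	for tag, value := range it.tags.Fixed() {
		dst[tag] = value
	}
}

func main() {
	it := &single{result: "Humphrey Bogart"}
	it.Tagger().Add("actor")
	it.Tagger().AddFixed("source", "demo")

	dst := make(map[string]Value)
	it.TagResults(dst)
	fmt.Println(dst["actor"], dst["source"]) // Humphrey Bogart demo
}
```

Because `Tagger()` returns a pointer to the iterator's own field, tags added through it are visible to later `TagResults` calls on the same iterator.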


@ -22,23 +22,28 @@ import (
"github.com/google/cayley/graph"
)
// The And iterator. Consists of a Base and a number of subiterators, the primary of which will
// The And iterator. Consists of a number of subiterators, the primary of which will
// be Next()ed if next is called.
type And struct {
Base
uid uint64
tags graph.Tagger
internalIterators []graph.Iterator
itCount int
primaryIt graph.Iterator
checkList []graph.Iterator
result graph.Value
}
// Creates a new And iterator.
func NewAnd() *And {
var and And
BaseInit(&and.Base)
and.internalIterators = make([]graph.Iterator, 0, 20)
and.checkList = nil
return &and
return &And{
uid: NextUID(),
internalIterators: make([]graph.Iterator, 0, 20),
}
}
func (it *And) UID() uint64 {
return it.uid
}
// Reset all internal iterators
@ -50,10 +55,33 @@ func (it *And) Reset() {
it.checkList = nil
}
func (it *And) Tagger() *graph.Tagger {
return &it.tags
}
// An extended TagResults, as it needs to add its own results and
// recurse down its subiterators.
func (it *And) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
if it.primaryIt != nil {
it.primaryIt.TagResults(dst)
}
for _, sub := range it.internalIterators {
sub.TagResults(dst)
}
}
func (it *And) Clone() graph.Iterator {
and := NewAnd()
and.AddSubIterator(it.primaryIt.Clone())
and.CopyTagsFrom(it)
and.tags.CopyFrom(it)
for _, sub := range it.internalIterators {
and.AddSubIterator(sub.Clone())
}
@ -71,18 +99,6 @@ func (it *And) SubIterators() []graph.Iterator {
return iters
}
// Overrides Base TagResults, as it needs to add it's own results and
// recurse down it's subiterators.
func (it *And) TagResults(dst map[string]graph.Value) {
it.Base.TagResults(dst)
if it.primaryIt != nil {
it.primaryIt.TagResults(dst)
}
for _, sub := range it.internalIterators {
sub.TagResults(dst)
}
}
// DEPRECATED Returns the ResultTree for this iterator, recurses to its subiterators.
func (it *And) ResultTree() *graph.ResultTree {
tree := graph.NewResultTree(it.Result())
@ -101,7 +117,7 @@ func (it *And) DebugString(indent int) string {
total += fmt.Sprintf("%d:\n%s\n", i, sub.DebugString(indent+4))
}
var tags string
for _, k := range it.Tags() {
for _, k := range it.tags.Tags() {
tags += fmt.Sprintf("%s;", k)
}
spaces := strings.Repeat(" ", indent+2)
@ -144,16 +160,20 @@ func (it *And) Next() (graph.Value, bool) {
var curr graph.Value
var exists bool
for {
curr, exists = it.primaryIt.Next()
curr, exists = graph.Next(it.primaryIt)
if !exists {
return graph.NextLogOut(it, nil, false)
}
if it.checkSubIts(curr) {
it.Last = curr
it.result = curr
return graph.NextLogOut(it, curr, true)
}
}
panic("Somehow broke out of Next() loop in And")
panic("unreachable")
}
func (it *And) Result() graph.Value {
return it.result
}
// Checks a value against the non-primary iterators, in order.
@ -177,7 +197,7 @@ func (it *And) checkCheckList(val graph.Value) bool {
}
}
if ok {
it.Last = val
it.result = val
}
return graph.CheckLogOut(it, val, ok)
}
@ -196,7 +216,7 @@ func (it *And) Check(val graph.Value) bool {
if !othersGood {
return graph.CheckLogOut(it, val, false)
}
it.Last = val
it.result = val
return graph.CheckLogOut(it, val, true)
}


@ -82,7 +82,7 @@ func (it *And) Optimize() (graph.Iterator, bool) {
}
// Move the tags hanging on us (like any good replacement).
newAnd.CopyTagsFrom(it)
newAnd.tags.CopyFrom(it)
newAnd.optimizeCheck()
@ -145,14 +145,14 @@ func optimizeOrder(its []graph.Iterator) []graph.Iterator {
// all of its contents, and to Check() each of those against everyone
// else.
for _, it := range its {
if !it.CanNext() {
if _, canNext := it.(graph.Nexter); !canNext {
bad = append(bad, it)
continue
}
rootStats := it.Stats()
cost := rootStats.NextCost
for _, f := range its {
if !f.CanNext() {
if _, canNext := f.(graph.Nexter); !canNext {
continue
}
if f == it {
@ -177,7 +177,7 @@ func optimizeOrder(its []graph.Iterator) []graph.Iterator {
// ... push everyone else after...
for _, it := range its {
if !it.CanNext() {
if _, canNext := it.(graph.Nexter); !canNext {
continue
}
if it != best {
@ -213,11 +213,11 @@ func (it *And) optimizeCheck() {
func (it *And) getSubTags() map[string]struct{} {
tags := make(map[string]struct{})
for _, sub := range it.SubIterators() {
for _, tag := range sub.Tags() {
for _, tag := range sub.Tagger().Tags() {
tags[tag] = struct{}{}
}
}
for _, tag := range it.Tags() {
for _, tag := range it.tags.Tags() {
tags[tag] = struct{}{}
}
return tags
@ -227,13 +227,14 @@ func (it *And) getSubTags() map[string]struct{} {
// src itself, and moves them to dst.
func moveTagsTo(dst graph.Iterator, src *And) {
tags := src.getSubTags()
for _, tag := range dst.Tags() {
for _, tag := range dst.Tagger().Tags() {
if _, ok := tags[tag]; ok {
delete(tags, tag)
}
}
dt := dst.Tagger()
for k := range tags {
dst.AddTag(k)
dt.Add(k)
}
}


@ -32,9 +32,9 @@ func TestIteratorPromotion(t *testing.T) {
a := NewAnd()
a.AddSubIterator(all)
a.AddSubIterator(fixed)
all.AddTag("a")
fixed.AddTag("b")
a.AddTag("c")
all.Tagger().Add("a")
fixed.Tagger().Add("b")
a.Tagger().Add("c")
newIt, changed := a.Optimize()
if !changed {
t.Error("Iterator didn't optimize")
@ -43,7 +43,7 @@ func TestIteratorPromotion(t *testing.T) {
t.Error("Expected fixed iterator")
}
tagsExpected := []string{"a", "b", "c"}
tags := newIt.Tags()
tags := newIt.Tagger().Tags()
sort.Strings(tags)
if !reflect.DeepEqual(tags, tagsExpected) {
t.Fatal("Tags don't match")
@ -67,9 +67,9 @@ func TestNullIteratorAnd(t *testing.T) {
func TestReorderWithTag(t *testing.T) {
all := NewInt64(100, 300)
all.AddTag("good")
all.Tagger().Add("good")
all2 := NewInt64(1, 30000)
all2.AddTag("slow")
all2.Tagger().Add("slow")
a := NewAnd()
// Make all2 the default iterator
a.AddSubIterator(all2)
@ -82,7 +82,7 @@ func TestReorderWithTag(t *testing.T) {
expectedTags := []string{"good", "slow"}
tagsOut := make([]string, 0)
for _, sub := range newIt.SubIterators() {
for _, x := range sub.Tags() {
for _, x := range sub.Tagger().Tags() {
tagsOut = append(tagsOut, x)
}
}
@ -93,9 +93,9 @@ func TestReorderWithTag(t *testing.T) {
func TestAndStatistics(t *testing.T) {
all := NewInt64(100, 300)
all.AddTag("good")
all.Tagger().Add("good")
all2 := NewInt64(1, 30000)
all2.AddTag("slow")
all2.Tagger().Add("slow")
a := NewAnd()
// Make all2 the default iterator
a.AddSubIterator(all2)


@ -24,11 +24,11 @@ import (
func TestTag(t *testing.T) {
fix1 := newFixed()
fix1.Add(234)
fix1.AddTag("foo")
fix1.Tagger().Add("foo")
and := NewAnd()
and.AddSubIterator(fix1)
and.AddTag("bar")
out := fix1.Tags()
and.Tagger().Add("bar")
out := fix1.Tagger().Tags()
if len(out) != 1 {
t.Errorf("Expected length 1, got %d", len(out))
}


@ -30,10 +30,12 @@ import (
// A Fixed iterator consists of its values, an index (where it is in the process of Next()ing) and
// an equality function.
type Fixed struct {
Base
uid uint64
tags graph.Tagger
values []graph.Value
lastIndex int
cmp Equality
result graph.Value
}
// Define the signature of an equality function.
@ -54,12 +56,15 @@ func newFixed() *Fixed {
// Creates a new Fixed iterator with a custom comparator.
func NewFixedIteratorWithCompare(compareFn Equality) *Fixed {
var it Fixed
BaseInit(&it.Base)
it.values = make([]graph.Value, 0, 20)
it.lastIndex = 0
it.cmp = compareFn
return &it
return &Fixed{
uid: NextUID(),
values: make([]graph.Value, 0, 20),
cmp: compareFn,
}
}
func (it *Fixed) UID() uint64 {
return it.uid
}
func (it *Fixed) Reset() {
@ -68,12 +73,26 @@ func (it *Fixed) Reset() {
func (it *Fixed) Close() {}
func (it *Fixed) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Fixed) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *Fixed) Clone() graph.Iterator {
out := NewFixedIteratorWithCompare(it.cmp)
for _, val := range it.values {
out.Add(val)
}
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
@ -92,7 +111,7 @@ func (it *Fixed) DebugString(indent int) string {
return fmt.Sprintf("%s(%s tags: %s Size: %d id0: %d)",
strings.Repeat(" ", indent),
it.Type(),
it.FixedTags(),
it.tags.Fixed(),
len(it.values),
value,
)
@ -109,7 +128,7 @@ func (it *Fixed) Check(v graph.Value) bool {
graph.CheckLogIn(it, v)
for _, x := range it.values {
if it.cmp(x, v) {
it.Last = x
it.result = x
return graph.CheckLogOut(it, v, true)
}
}
@ -123,11 +142,29 @@ func (it *Fixed) Next() (graph.Value, bool) {
return graph.NextLogOut(it, nil, false)
}
out := it.values[it.lastIndex]
it.Last = out
it.result = out
it.lastIndex++
return graph.NextLogOut(it, out, true)
}
// DEPRECATED
func (it *Fixed) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Fixed) Result() graph.Value {
return it.result
}
func (it *Fixed) NextResult() bool {
return false
}
// No sub-iterators.
func (it *Fixed) SubIterators() []graph.Iterator {
return nil
}
// Optimize() for a Fixed iterator is simple. Returns a Null iterator if it's empty
// (so that other iterators upstream can treat this as null) or there is no
// optimization.


@ -40,28 +40,35 @@ import (
"github.com/barakmich/glog"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
// A HasA consists of a reference back to the graph.TripleStore that it references,
// a primary subiterator, a direction in which the triples for that subiterator point,
// and a temporary holder for the iterator generated on Check().
type HasA struct {
Base
uid uint64
tags graph.Tagger
ts graph.TripleStore
primaryIt graph.Iterator
dir graph.Direction
dir quad.Direction
resultIt graph.Iterator
result graph.Value
}
// Construct a new HasA iterator, given the triple subiterator, and the triple
// direction for which it stands.
func NewHasA(ts graph.TripleStore, subIt graph.Iterator, d graph.Direction) *HasA {
var hasa HasA
BaseInit(&hasa.Base)
hasa.ts = ts
hasa.primaryIt = subIt
hasa.dir = d
return &hasa
func NewHasA(ts graph.TripleStore, subIt graph.Iterator, d quad.Direction) *HasA {
return &HasA{
uid: NextUID(),
ts: ts,
primaryIt: subIt,
dir: d,
}
}
func (it *HasA) UID() uint64 {
return it.uid
}
// Return our sole subiterator.
@ -76,14 +83,18 @@ func (it *HasA) Reset() {
}
}
func (it *HasA) Tagger() *graph.Tagger {
return &it.tags
}
func (it *HasA) Clone() graph.Iterator {
out := NewHasA(it.ts, it.primaryIt.Clone(), it.dir)
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
// Direction accessor.
func (it *HasA) Direction() graph.Direction { return it.dir }
func (it *HasA) Direction() quad.Direction { return it.dir }
// Pass the Optimize() call along to the subiterator. If it becomes Null,
// then the HasA becomes Null (there are no triples that have any directions).
@ -100,7 +111,14 @@ func (it *HasA) Optimize() (graph.Iterator, bool) {
// Pass the TagResults down the chain.
func (it *HasA) TagResults(dst map[string]graph.Value) {
it.Base.TagResults(dst)
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
it.primaryIt.TagResults(dst)
}
@ -114,7 +132,7 @@ func (it *HasA) ResultTree() *graph.ResultTree {
// Print some information about this iterator.
func (it *HasA) DebugString(indent int) string {
var tags string
for _, k := range it.Tags() {
for _, k := range it.tags.Tags() {
tags += fmt.Sprintf("%s;", k)
}
return fmt.Sprintf("%s(%s %d tags:%s direction:%s\n%s)", strings.Repeat(" ", indent), it.Type(), it.UID(), tags, it.dir, it.primaryIt.DebugString(indent+4))
@ -141,15 +159,15 @@ func (it *HasA) Check(val graph.Value) bool {
// another match is made.
func (it *HasA) GetCheckResult() bool {
for {
linkVal, ok := it.resultIt.Next()
linkVal, ok := graph.Next(it.resultIt)
if !ok {
break
}
if glog.V(4) {
glog.V(4).Infoln("Triple is", it.ts.Triple(linkVal))
glog.V(4).Infoln("Quad is", it.ts.Quad(linkVal))
}
if it.primaryIt.Check(linkVal) {
it.Last = it.ts.TripleDirection(linkVal, it.dir)
it.result = it.ts.TripleDirection(linkVal, it.dir)
return true
}
}
@ -180,16 +198,20 @@ func (it *HasA) Next() (graph.Value, bool) {
}
it.resultIt = &Null{}
tID, ok := it.primaryIt.Next()
tID, ok := graph.Next(it.primaryIt)
if !ok {
return graph.NextLogOut(it, 0, false)
}
name := it.ts.Triple(tID).Get(it.dir)
name := it.ts.Quad(tID).Get(it.dir)
val := it.ts.ValueOf(name)
it.Last = val
it.result = val
return graph.NextLogOut(it, val, true)
}
func (it *HasA) Result() graph.Value {
return it.result
}
// GetStats() returns the statistics on the HasA iterator. This is curious. Next
// cost is easy, it's an extra call or so on top of the subiterator Next cost.
// CheckCost involves going to the graph.TripleStore, iterating out values, and hoping
@ -221,3 +243,7 @@ func (it *HasA) Close() {
// Register this iterator as a HasA.
func (it *HasA) Type() graph.Type { return graph.HasA }
func (it *HasA) Size() (int64, bool) {
return 0, true
}


@ -14,16 +14,12 @@
package iterator
// Define the general iterator interface, as well as the Base which all
// iterators can "inherit" from to get default iterator functionality.
// Define the general iterator interface.
import (
"fmt"
"strings"
"sync/atomic"
"github.com/barakmich/glog"
"github.com/google/cayley/graph"
)
@ -33,142 +29,40 @@ func NextUID() uint64 {
return atomic.AddUint64(&nextIteratorID, 1) - 1
}
// The Base iterator is the iterator other iterators inherit from to get some
// default functionality.
type Base struct {
Last graph.Value
tags []string
fixedTags map[string]graph.Value
canNext bool
uid uint64
}
// Called by subclases.
func BaseInit(it *Base) {
// Your basic iterator is nextable
it.canNext = true
if glog.V(2) {
it.uid = NextUID()
}
}
func (it *Base) UID() uint64 {
return it.uid
}
// Adds a tag to the iterator. Most iterators don't need to override.
func (it *Base) AddTag(tag string) {
if it.tags == nil {
it.tags = make([]string, 0)
}
it.tags = append(it.tags, tag)
}
func (it *Base) AddFixedTag(tag string, value graph.Value) {
if it.fixedTags == nil {
it.fixedTags = make(map[string]graph.Value)
}
it.fixedTags[tag] = value
}
// Returns the tags.
func (it *Base) Tags() []string {
return it.tags
}
func (it *Base) FixedTags() map[string]graph.Value {
return it.fixedTags
}
func (it *Base) CopyTagsFrom(other_it graph.Iterator) {
for _, tag := range other_it.Tags() {
it.AddTag(tag)
}
for k, v := range other_it.FixedTags() {
it.AddFixedTag(k, v)
}
}
// Prints a silly debug string. Most classes override.
func (it *Base) DebugString(indent int) string {
return fmt.Sprintf("%s(base)", strings.Repeat(" ", indent))
}
// Nothing in a base iterator.
func (it *Base) Check(v graph.Value) bool {
return false
}
// Base iterators should never appear in a tree if they are, select against
// them.
func (it *Base) Stats() graph.IteratorStats {
return graph.IteratorStats{100000, 100000, 100000}
}
// DEPRECATED
func (it *Base) ResultTree() *graph.ResultTree {
tree := graph.NewResultTree(it.Result())
return tree
}
// Nothing in a base iterator.
func (it *Base) Next() (graph.Value, bool) {
return nil, false
}
func (it *Base) NextResult() bool {
return false
}
// Returns the last result of an iterator.
func (it *Base) Result() graph.Value {
return it.Last
}
// If you're empty and you know it, clap your hands.
func (it *Base) Size() (int64, bool) {
return 0, true
}
// No subiterators. Only those with subiterators need to do anything here.
func (it *Base) SubIterators() []graph.Iterator {
return nil
}
// Accessor
func (it *Base) CanNext() bool { return it.canNext }
// Fill the map based on the tags assigned to this iterator. Default
// functionality works well for most iterators.
func (it *Base) TagResults(dst map[string]graph.Value) {
for _, tag := range it.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.FixedTags() {
dst[tag] = value
}
}
// Nothing to clean up.
// func (it *Base) Close() {}
func (it *Null) Close() {}
func (it *Base) Reset() {}
// Here we define the simplest base iterator -- the Null iterator. It contains nothing.
// Here we define the simplest iterator -- the Null iterator. It contains nothing.
// It is the empty set. Often times, queries that contain one of these match nothing,
// so it's important to give it a special iterator.
type Null struct {
Base
uid uint64
tags graph.Tagger
}
// Fairly useless New function.
func NewNull() *Null {
return &Null{}
return &Null{uid: NextUID()}
}
func (it *Null) UID() uint64 {
return it.uid
}
func (it *Null) Tagger() *graph.Tagger {
return &it.tags
}
// Fill the map based on the tags assigned to this iterator.
func (it *Null) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *Null) Check(graph.Value) bool {
return false
}
func (it *Null) Clone() graph.Iterator { return NewNull() }
@ -185,6 +79,34 @@ func (it *Null) DebugString(indent int) string {
return strings.Repeat(" ", indent) + "(null)"
}
func (it *Null) Next() (graph.Value, bool) {
return nil, false
}
func (it *Null) Result() graph.Value {
return nil
}
func (it *Null) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Null) SubIterators() []graph.Iterator {
return nil
}
func (it *Null) NextResult() bool {
return false
}
func (it *Null) Size() (int64, bool) {
return 0, true
}
func (it *Null) Reset() {}
func (it *Null) Close() {}
// A null iterator costs nothing. Use it!
func (it *Null) Stats() graph.IteratorStats {
return graph.IteratorStats{}


@ -34,29 +34,36 @@ import (
"strings"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
// A LinksTo has a reference back to the graph.TripleStore (to create the iterators
// for each node), the subiterator, and the direction the iterator comes from.
// `nextIt` is the temporary iterator held per result in `primaryIt`.
type LinksTo struct {
Base
uid uint64
tags graph.Tagger
ts graph.TripleStore
primaryIt graph.Iterator
dir graph.Direction
dir quad.Direction
nextIt graph.Iterator
result graph.Value
}
// Construct a new LinksTo iterator around a direction and a subiterator of
// nodes.
func NewLinksTo(ts graph.TripleStore, it graph.Iterator, d graph.Direction) *LinksTo {
var lto LinksTo
BaseInit(&lto.Base)
lto.ts = ts
lto.primaryIt = it
lto.dir = d
lto.nextIt = &Null{}
return &lto
func NewLinksTo(ts graph.TripleStore, it graph.Iterator, d quad.Direction) *LinksTo {
return &LinksTo{
uid: NextUID(),
ts: ts,
primaryIt: it,
dir: d,
nextIt: &Null{},
}
}
func (it *LinksTo) UID() uint64 {
return it.uid
}
func (it *LinksTo) Reset() {
@ -67,18 +74,29 @@ func (it *LinksTo) Reset() {
it.nextIt = &Null{}
}
func (it *LinksTo) Tagger() *graph.Tagger {
return &it.tags
}
func (it *LinksTo) Clone() graph.Iterator {
out := NewLinksTo(it.ts, it.primaryIt.Clone(), it.dir)
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
// Return the direction under consideration.
func (it *LinksTo) Direction() graph.Direction { return it.dir }
func (it *LinksTo) Direction() quad.Direction { return it.dir }
// Tag these results, and our subiterator's results.
func (it *LinksTo) TagResults(dst map[string]graph.Value) {
it.Base.TagResults(dst)
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
it.primaryIt.TagResults(dst)
}
@ -102,7 +120,7 @@ func (it *LinksTo) Check(val graph.Value) bool {
graph.CheckLogIn(it, val)
node := it.ts.TripleDirection(val, it.dir)
if it.primaryIt.Check(node) {
it.Last = val
it.result = val
return graph.CheckLogOut(it, val, true)
}
return graph.CheckLogOut(it, val, false)
@ -137,10 +155,10 @@ func (it *LinksTo) Optimize() (graph.Iterator, bool) {
// Next()ing a LinksTo operates as described above.
func (it *LinksTo) Next() (graph.Value, bool) {
graph.NextLogIn(it)
val, ok := it.nextIt.Next()
val, ok := graph.Next(it.nextIt)
if !ok {
// Subiterator is empty, get another one
candidate, ok := it.primaryIt.Next()
candidate, ok := graph.Next(it.primaryIt)
if !ok {
// We're out of nodes in our subiterator, so we're done as well.
return graph.NextLogOut(it, 0, false)
@ -150,10 +168,14 @@ func (it *LinksTo) Next() (graph.Value, bool) {
// Recurse -- return the first in the next set.
return it.Next()
}
it.Last = val
it.result = val
return graph.NextLogOut(it, val, ok)
}
func (it *LinksTo) Result() graph.Value {
return it.result
}
// Close our subiterators.
func (it *LinksTo) Close() {
it.nextIt.Close()
@ -181,3 +203,7 @@ func (it *LinksTo) Stats() graph.IteratorStats {
Size: fanoutFactor * subitStats.Size,
}
}
func (it *LinksTo) Size() (int64, bool) {
return 0, true
}

View file

@ -17,7 +17,7 @@ package iterator
import (
"testing"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
func TestLinksTo(t *testing.T) {
@ -32,12 +32,12 @@ func TestLinksTo(t *testing.T) {
t.Fatalf("Failed to return correct value, got:%v expect:1", val)
}
fixed.Add(val)
lto := NewLinksTo(ts, fixed, graph.Object)
lto := NewLinksTo(ts, fixed, quad.Object)
val, ok := lto.Next()
if !ok {
t.Error("At least one triple matches the fixed object")
}
if val != 2 {
t.Errorf("Triple index 2, such as %s, should match %s", ts.Triple(2), ts.Triple(val))
t.Errorf("Quad index 2, such as %s, should match %s", ts.Quad(2), ts.Quad(val))
}
}

View file

@ -17,15 +17,18 @@ package iterator
// A quickly mocked version of the TripleStore interface, for use in tests.
// Could make better use of Mock.Called, but will be filled in as needed.
import "github.com/google/cayley/graph"
import (
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
type store struct {
data []string
iter graph.Iterator
}
func (ts *store) ValueOf(s string) graph.Value {
for i, v := range ts.data {
func (qs *store) ValueOf(s string) graph.Value {
for i, v := range qs.data {
if s == v {
return i
}
@ -33,42 +36,42 @@ func (ts *store) ValueOf(s string) graph.Value {
return nil
}
func (ts *store) AddTriple(*graph.Triple) {}
func (qs *store) AddTriple(*quad.Quad) {}
func (ts *store) AddTripleSet([]*graph.Triple) {}
func (qs *store) AddTripleSet([]*quad.Quad) {}
func (ts *store) Triple(graph.Value) *graph.Triple { return &graph.Triple{} }
func (qs *store) Quad(graph.Value) *quad.Quad { return &quad.Quad{} }
func (ts *store) TripleIterator(d graph.Direction, i graph.Value) graph.Iterator {
return ts.iter
func (qs *store) TripleIterator(d quad.Direction, i graph.Value) graph.Iterator {
return qs.iter
}
func (ts *store) NodesAllIterator() graph.Iterator { return &Null{} }
func (qs *store) NodesAllIterator() graph.Iterator { return &Null{} }
func (ts *store) TriplesAllIterator() graph.Iterator { return &Null{} }
func (qs *store) TriplesAllIterator() graph.Iterator { return &Null{} }
func (ts *store) NameOf(v graph.Value) string {
func (qs *store) NameOf(v graph.Value) string {
i := v.(int)
if i < 0 || i >= len(ts.data) {
if i < 0 || i >= len(qs.data) {
return ""
}
return ts.data[i]
return qs.data[i]
}
func (ts *store) Size() int64 { return 0 }
func (qs *store) Size() int64 { return 0 }
func (ts *store) DebugPrint() {}
func (qs *store) DebugPrint() {}
func (ts *store) OptimizeIterator(it graph.Iterator) (graph.Iterator, bool) {
func (qs *store) OptimizeIterator(it graph.Iterator) (graph.Iterator, bool) {
return &Null{}, false
}
func (ts *store) FixedIterator() graph.FixedIterator {
func (qs *store) FixedIterator() graph.FixedIterator {
return NewFixedIteratorWithCompare(BasicEquality)
}
func (ts *store) Close() {}
func (qs *store) Close() {}
func (ts *store) TripleDirection(graph.Value, graph.Direction) graph.Value { return 0 }
func (qs *store) TripleDirection(graph.Value, quad.Direction) graph.Value { return 0 }
func (ts *store) RemoveTriple(t *graph.Triple) {}
func (qs *store) RemoveTriple(t *quad.Quad) {}

View file

@ -30,26 +30,31 @@ import (
"fmt"
"strings"
"github.com/barakmich/glog"
"github.com/google/cayley/graph"
)
// An optional iterator has the subconstraint iterator we wish to be optional
// An optional iterator has the sub-constraint iterator we wish to be optional
// and whether the last check we received was true or false.
type Optional struct {
Base
uid uint64
tags graph.Tagger
subIt graph.Iterator
lastCheck bool
result graph.Value
}
// Creates a new optional iterator.
func NewOptional(it graph.Iterator) *Optional {
var o Optional
BaseInit(&o.Base)
o.canNext = false
o.subIt = it
return &o
return &Optional{
uid: NextUID(),
subIt: it,
}
}
func (it *Optional) CanNext() bool { return false }
func (it *Optional) UID() uint64 {
return it.uid
}
func (it *Optional) Reset() {
@ -61,17 +66,23 @@ func (it *Optional) Close() {
it.subIt.Close()
}
func (it *Optional) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Optional) Clone() graph.Iterator {
out := NewOptional(it.subIt.Clone())
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
// Nexting the iterator is unsupported -- error and return an empty set.
// (As above, a reasonable alternative would be to Next() an all iterator)
func (it *Optional) Next() (graph.Value, bool) {
glog.Errorln("Nexting an un-nextable iterator")
return nil, false
// DEPRECATED
func (it *Optional) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Optional) Result() graph.Value {
return it.result
}
// An optional iterator only has a next result if, (a) last time we checked
@ -84,13 +95,18 @@ func (it *Optional) NextResult() bool {
return false
}
// No subiterators.
func (it *Optional) SubIterators() []graph.Iterator {
return nil
}
// Check() is the real hack of this iterator. It always returns true, regardless
// of whether the subiterator matched. But we keep track of whether the subiterator
// matched for results purposes.
func (it *Optional) Check(val graph.Value) bool {
checked := it.subIt.Check(val)
it.lastCheck = checked
it.Last = val
it.result = val
return true
}
@ -111,7 +127,7 @@ func (it *Optional) DebugString(indent int) string {
return fmt.Sprintf("%s(%s tags:%s\n%s)",
strings.Repeat(" ", indent),
it.Type(),
it.Tags(),
it.tags.Tags(),
it.subIt.DebugString(indent+4))
}
@ -135,3 +151,8 @@ func (it *Optional) Stats() graph.IteratorStats {
Size: subStats.Size,
}
}
// If you're empty and you know it, clap your hands.
func (it *Optional) Size() (int64, bool) {
return 0, true
}
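
The Check() behaviour described for the Optional iterator (always report a match, but remember whether the sub-iterator really matched) is small enough to model directly. The sketch below is illustrative only: `Value`, `checker`, `set`, and `optional` are hypothetical stand-ins for the cayley types.

```go
package main

import "fmt"

// Value stands in for graph.Value (hypothetical, for illustration only).
type Value interface{}

// checker is the one method of the iterator contract this sketch needs.
type checker interface {
	Check(Value) bool
}

// set is a toy sub-iterator backed by a map.
type set map[Value]bool

func (s set) Check(v Value) bool { return s[v] }

// optional wraps a sub-iterator whose match is not required.
type optional struct {
	sub       checker
	lastCheck bool  // whether the sub-iterator matched the last value
	result    Value // the last value checked
}

// Check always succeeds, but records the sub-iterator's verdict so that
// later result handling can tell whether the optional part matched.
func (o *optional) Check(v Value) bool {
	o.lastCheck = o.sub.Check(v)
	o.result = v
	return true
}

func main() {
	o := &optional{sub: set{"a": true}}
	fmt.Println(o.Check("a"), o.lastCheck) // true true
	fmt.Println(o.Check("z"), o.lastCheck) // true false
}
```

Returning true unconditionally is what lets an enclosing intersection keep a result even when the optional branch has nothing to contribute.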

View file

@ -29,29 +29,34 @@ import (
)
type Or struct {
Base
uid uint64
tags graph.Tagger
isShortCircuiting bool
internalIterators []graph.Iterator
itCount int
currentIterator int
result graph.Value
}
func NewOr() *Or {
var or Or
BaseInit(&or.Base)
or.internalIterators = make([]graph.Iterator, 0, 20)
or.isShortCircuiting = false
or.currentIterator = -1
return &or
return &Or{
uid: NextUID(),
internalIterators: make([]graph.Iterator, 0, 20),
currentIterator: -1,
}
}
func NewShortCircuitOr() *Or {
var or Or
BaseInit(&or.Base)
or.internalIterators = make([]graph.Iterator, 0, 20)
or.isShortCircuiting = true
or.currentIterator = -1
return &or
return &Or{
uid: NextUID(),
internalIterators: make([]graph.Iterator, 0, 20),
isShortCircuiting: true,
currentIterator: -1,
}
}
func (it *Or) UID() uint64 {
return it.uid
}
// Reset all internal iterators
@ -62,6 +67,10 @@ func (it *Or) Reset() {
it.currentIterator = -1
}
func (it *Or) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Or) Clone() graph.Iterator {
var or *Or
if it.isShortCircuiting {
@ -72,7 +81,7 @@ func (it *Or) Clone() graph.Iterator {
for _, sub := range it.internalIterators {
or.AddSubIterator(sub.Clone())
}
or.CopyTagsFrom(it)
or.tags.CopyFrom(it)
return or
}
@ -84,7 +93,14 @@ func (it *Or) SubIterators() []graph.Iterator {
// Overrides BaseIterator TagResults, as it needs to add its own results and
// recurse down its subiterators.
func (it *Or) TagResults(dst map[string]graph.Value) {
it.Base.TagResults(dst)
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
it.internalIterators[it.currentIterator].TagResults(dst)
}
@ -105,7 +121,7 @@ func (it *Or) DebugString(indent int) string {
total += fmt.Sprintf("%d:\n%s\n", i, sub.DebugString(indent+4))
}
var tags string
for _, k := range it.Tags() {
for _, k := range it.tags.Tags() {
tags += fmt.Sprintf("%s;", k)
}
spaces := strings.Repeat(" ", indent+2)
@ -139,7 +155,7 @@ func (it *Or) Next() (graph.Value, bool) {
firstTime = true
}
curIt := it.internalIterators[it.currentIterator]
curr, exists = curIt.Next()
curr, exists = graph.Next(curIt)
if !exists {
if it.isShortCircuiting && !firstTime {
return graph.NextLogOut(it, nil, false)
@ -149,11 +165,15 @@ func (it *Or) Next() (graph.Value, bool) {
return graph.NextLogOut(it, nil, false)
}
} else {
it.Last = curr
it.result = curr
return graph.NextLogOut(it, curr, true)
}
}
panic("Somehow broke out of Next() loop in Or")
panic("unreachable")
}
func (it *Or) Result() graph.Value {
return it.result
}
// Checks a value against the iterators, in order.
@ -176,7 +196,7 @@ func (it *Or) Check(val graph.Value) bool {
if !anyGood {
return graph.CheckLogOut(it, val, false)
}
it.Last = val
it.result = val
return graph.CheckLogOut(it, val, true)
}
@ -247,7 +267,7 @@ func (it *Or) Optimize() (graph.Iterator, bool) {
}
// Move the tags hanging on us (like any good replacement).
newOr.CopyTagsFrom(it)
newOr.tags.CopyFrom(it)
// And close ourselves but not our subiterators -- some may still be alive in
// the new And (they were unchanged upon calling Optimize() on them, at the

View file

@ -24,7 +24,7 @@ import (
func iterated(it graph.Iterator) []int {
var res []int
for {
val, ok := it.Next()
val, ok := graph.Next(it)
if !ok {
break
}

View file

@ -16,6 +16,7 @@ package iterator
import (
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
type Node struct {
@ -39,7 +40,7 @@ type queryShape struct {
ts graph.TripleStore
nodeId int
hasaIds []int
hasaDirs []graph.Direction
hasaDirs []quad.Direction
}
func OutputQueryShapeForIterator(it graph.Iterator, ts graph.TripleStore, outputMap map[string]interface{}) {
@ -62,11 +63,11 @@ func (qs *queryShape) AddLink(l *Link) {
qs.links = append(qs.links, *l)
}
func (qs *queryShape) LastHasa() (int, graph.Direction) {
func (qs *queryShape) LastHasa() (int, quad.Direction) {
return qs.hasaIds[len(qs.hasaIds)-1], qs.hasaDirs[len(qs.hasaDirs)-1]
}
func (qs *queryShape) PushHasa(i int, d graph.Direction) {
func (qs *queryShape) PushHasa(i int, d quad.Direction) {
qs.hasaIds = append(qs.hasaIds, i)
qs.hasaDirs = append(qs.hasaDirs, d)
}
@ -107,10 +108,10 @@ func (qs *queryShape) StealNode(left *Node, right *Node) {
func (qs *queryShape) MakeNode(it graph.Iterator) *Node {
n := Node{Id: qs.nodeId}
for _, tag := range it.Tags() {
for _, tag := range it.Tagger().Tags() {
n.Tags = append(n.Tags, tag)
}
for k, _ := range it.FixedTags() {
for k, _ := range it.Tagger().Fixed() {
n.Tags = append(n.Tags, k)
}
@ -129,7 +130,7 @@ func (qs *queryShape) MakeNode(it graph.Iterator) *Node {
case graph.Fixed:
n.IsFixed = true
for {
val, more := it.Next()
val, more := graph.Next(it)
if !more {
break
}
@ -159,10 +160,10 @@ func (qs *queryShape) MakeNode(it graph.Iterator) *Node {
qs.nodeId++
newNode := qs.MakeNode(lto.primaryIt)
hasaID, hasaDir := qs.LastHasa()
if (hasaDir == graph.Subject && lto.dir == graph.Object) ||
(hasaDir == graph.Object && lto.dir == graph.Subject) {
if (hasaDir == quad.Subject && lto.dir == quad.Object) ||
(hasaDir == quad.Object && lto.dir == quad.Subject) {
qs.AddNode(newNode)
if hasaDir == graph.Subject {
if hasaDir == quad.Subject {
qs.AddLink(&Link{hasaID, newNode.Id, 0, n.Id})
} else {
qs.AddLink(&Link{newNode.Id, hasaID, 0, n.Id})

View file

@ -19,6 +19,7 @@ import (
"testing"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
func hasaWithTag(ts graph.TripleStore, tag string, target string) *HasA {
@ -26,14 +27,14 @@ func hasaWithTag(ts graph.TripleStore, tag string, target string) *HasA {
obj := ts.FixedIterator()
obj.Add(ts.ValueOf(target))
obj.AddTag(tag)
and.AddSubIterator(NewLinksTo(ts, obj, graph.Object))
obj.Tagger().Add(tag)
and.AddSubIterator(NewLinksTo(ts, obj, quad.Object))
pred := ts.FixedIterator()
pred.Add(ts.ValueOf("status"))
and.AddSubIterator(NewLinksTo(ts, pred, graph.Predicate))
and.AddSubIterator(NewLinksTo(ts, pred, quad.Predicate))
return NewHasA(ts, and, graph.Subject)
return NewHasA(ts, and, quad.Subject)
}
func TestQueryShape(t *testing.T) {
@ -48,7 +49,7 @@ func TestQueryShape(t *testing.T) {
// Given a single linkage iterator's shape.
hasa := hasaWithTag(ts, "tag", "cool")
hasa.AddTag("top")
hasa.Tagger().Add("top")
shape := make(map[string]interface{})
OutputQueryShapeForIterator(hasa, ts, shape)
@ -93,22 +94,22 @@ func TestQueryShape(t *testing.T) {
andInternal := NewAnd()
hasa1 := hasaWithTag(ts, "tag1", "cool")
hasa1.AddTag("hasa1")
hasa1.Tagger().Add("hasa1")
andInternal.AddSubIterator(hasa1)
hasa2 := hasaWithTag(ts, "tag2", "fun")
hasa2.AddTag("hasa2")
hasa2.Tagger().Add("hasa2")
andInternal.AddSubIterator(hasa2)
pred := ts.FixedIterator()
pred.Add(ts.ValueOf("name"))
and := NewAnd()
and.AddSubIterator(NewLinksTo(ts, andInternal, graph.Subject))
and.AddSubIterator(NewLinksTo(ts, pred, graph.Predicate))
and.AddSubIterator(NewLinksTo(ts, andInternal, quad.Subject))
and.AddSubIterator(NewLinksTo(ts, pred, quad.Predicate))
shape = make(map[string]interface{})
OutputQueryShapeForIterator(NewHasA(ts, and, graph.Object), ts, shape)
OutputQueryShapeForIterator(NewHasA(ts, and, quad.Object), ts, shape)
links = shape["links"].([]Link)
if len(links) != 3 {

View file

@ -17,7 +17,7 @@ package iterator
// "Value Comparison" is a unary operator -- a filter across the values in the
// relevant subiterator.
//
// This is hugely useful for things like provenance, but value ranges in general
// This is hugely useful for things like label, but value ranges in general
// come up from time to time. At *worst* we're as big as our underlying iterator.
// At best, we're the null iterator.
//
@ -46,21 +46,27 @@ const (
)
type Comparison struct {
Base
subIt graph.Iterator
op Operator
val interface{}
ts graph.TripleStore
uid uint64
tags graph.Tagger
subIt graph.Iterator
op Operator
val interface{}
ts graph.TripleStore
result graph.Value
}
func NewComparison(sub graph.Iterator, op Operator, val interface{}, ts graph.TripleStore) *Comparison {
var vc Comparison
BaseInit(&vc.Base)
vc.subIt = sub
vc.op = op
vc.val = val
vc.ts = ts
return &vc
return &Comparison{
uid: NextUID(),
subIt: sub,
op: op,
val: val,
ts: ts,
}
}
func (it *Comparison) UID() uint64 {
return it.uid
}
// Here's the non-boilerplate part of the ValueComparison iterator. Given a value
@ -111,9 +117,13 @@ func (it *Comparison) Reset() {
it.subIt.Reset()
}
func (it *Comparison) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Comparison) Clone() graph.Iterator {
out := NewComparison(it.subIt.Clone(), it.op, it.val, it.ts)
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
@ -121,7 +131,7 @@ func (it *Comparison) Next() (graph.Value, bool) {
var val graph.Value
var ok bool
for {
val, ok = it.subIt.Next()
val, ok = graph.Next(it.subIt)
if !ok {
return nil, false
}
@ -129,10 +139,19 @@ func (it *Comparison) Next() (graph.Value, bool) {
break
}
}
it.Last = val
it.result = val
return val, ok
}
// DEPRECATED
func (it *Comparison) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Comparison) Result() graph.Value {
return it.result
}
func (it *Comparison) NextResult() bool {
for {
hasNext := it.subIt.NextResult()
@ -143,10 +162,15 @@ func (it *Comparison) NextResult() bool {
return true
}
}
it.Last = it.subIt.Result()
it.result = it.subIt.Result()
return true
}
// No subiterators.
func (it *Comparison) SubIterators() []graph.Iterator {
return nil
}
func (it *Comparison) Check(val graph.Value) bool {
if !it.doComparison(val) {
return false
@ -157,7 +181,14 @@ func (it *Comparison) Check(val graph.Value) bool {
// If we failed the check, then the subiterator should not contribute to the result
// set. Otherwise, go ahead and tag it.
func (it *Comparison) TagResults(dst map[string]graph.Value) {
it.Base.TagResults(dst)
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
it.subIt.TagResults(dst)
}
@ -188,3 +219,7 @@ func (it *Comparison) Optimize() (graph.Iterator, bool) {
func (it *Comparison) Stats() graph.IteratorStats {
return it.subIt.Stats()
}
func (it *Comparison) Size() (int64, bool) {
return 0, true
}

View file

@ -24,36 +24,51 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
type AllIterator struct {
iterator.Base
uid uint64
tags graph.Tagger
prefix []byte
dir graph.Direction
dir quad.Direction
open bool
iter ldbit.Iterator
ts *TripleStore
ro *opt.ReadOptions
result graph.Value
}
func NewAllIterator(prefix string, d graph.Direction, ts *TripleStore) *AllIterator {
var it AllIterator
iterator.BaseInit(&it.Base)
it.ro = &opt.ReadOptions{}
it.ro.DontFillCache = true
it.iter = ts.db.NewIterator(nil, it.ro)
it.prefix = []byte(prefix)
it.dir = d
it.open = true
it.ts = ts
func NewAllIterator(prefix string, d quad.Direction, ts *TripleStore) *AllIterator {
opts := &opt.ReadOptions{
DontFillCache: true,
}
it := AllIterator{
uid: iterator.NextUID(),
ro: opts,
iter: ts.db.NewIterator(nil, opts),
prefix: []byte(prefix),
dir: d,
open: true,
ts: ts,
}
it.iter.Seek(it.prefix)
if !it.iter.Valid() {
// FIXME(kortschak) What are the semantics here? Is this iterator usable?
// If not, we should return nil *Iterator and an error.
it.open = false
it.iter.Release()
}
return &it
}
func (it *AllIterator) UID() uint64 {
return it.uid
}
func (it *AllIterator) Reset() {
if !it.open {
it.iter = it.ts.db.NewIterator(nil, it.ro)
@ -66,15 +81,29 @@ func (it *AllIterator) Reset() {
}
}
func (it *AllIterator) Tagger() *graph.Tagger {
return &it.tags
}
func (it *AllIterator) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *AllIterator) Clone() graph.Iterator {
out := NewAllIterator(string(it.prefix), it.dir, it.ts)
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
func (it *AllIterator) Next() (graph.Value, bool) {
if !it.open {
it.Last = nil
it.result = nil
return nil, false
}
var out []byte
@ -88,12 +117,29 @@ func (it *AllIterator) Next() (graph.Value, bool) {
it.Close()
return nil, false
}
it.Last = out
it.result = out
return out, true
}
func (it *AllIterator) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *AllIterator) Result() graph.Value {
return it.result
}
func (it *AllIterator) NextResult() bool {
return false
}
// No subiterators.
func (it *AllIterator) SubIterators() []graph.Iterator {
return nil
}
func (it *AllIterator) Check(v graph.Value) bool {
it.Last = v
it.result = v
return true
}
@ -115,7 +161,7 @@ func (it *AllIterator) Size() (int64, bool) {
func (it *AllIterator) DebugString(indent int) string {
size, _ := it.Size()
return fmt.Sprintf("%s(%s tags: %v leveldb size:%d %s %p)", strings.Repeat(" ", indent), it.Type(), it.Tags(), size, it.dir, it)
return fmt.Sprintf("%s(%s tags: %v leveldb size:%d %s %p)", strings.Repeat(" ", indent), it.Type(), it.tags.Tags(), size, it.dir, it)
}
func (it *AllIterator) Type() graph.Type { return graph.All }

View file

@ -24,45 +24,63 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
type Iterator struct {
iterator.Base
uid uint64
tags graph.Tagger
nextPrefix []byte
checkId []byte
dir graph.Direction
dir quad.Direction
open bool
iter ldbit.Iterator
ts *TripleStore
qs *TripleStore
ro *opt.ReadOptions
originalPrefix string
result graph.Value
}
func NewIterator(prefix string, d graph.Direction, value graph.Value, ts *TripleStore) *Iterator {
var it Iterator
iterator.BaseInit(&it.Base)
it.checkId = value.([]byte)
it.dir = d
it.originalPrefix = prefix
it.nextPrefix = make([]byte, 0, 2+ts.hasher.Size())
it.nextPrefix = append(it.nextPrefix, []byte(prefix)...)
it.nextPrefix = append(it.nextPrefix, []byte(it.checkId[1:])...)
it.ro = &opt.ReadOptions{}
it.ro.DontFillCache = true
it.iter = ts.db.NewIterator(nil, it.ro)
it.open = true
it.ts = ts
func NewIterator(prefix string, d quad.Direction, value graph.Value, qs *TripleStore) *Iterator {
vb := value.([]byte)
p := make([]byte, 0, 2+qs.hasher.Size())
p = append(p, []byte(prefix)...)
p = append(p, []byte(vb[1:])...)
opts := &opt.ReadOptions{
DontFillCache: true,
}
it := Iterator{
uid: iterator.NextUID(),
nextPrefix: p,
checkId: vb,
dir: d,
originalPrefix: prefix,
ro: opts,
iter: qs.db.NewIterator(nil, opts),
open: true,
qs: qs,
}
ok := it.iter.Seek(it.nextPrefix)
if !ok {
// FIXME(kortschak) What are the semantics here? Is this iterator usable?
// If not, we should return nil *Iterator and an error.
it.open = false
it.iter.Release()
}
return &it
}
func (it *Iterator) UID() uint64 {
return it.uid
}
func (it *Iterator) Reset() {
if !it.open {
it.iter = it.ts.db.NewIterator(nil, it.ro)
it.iter = it.qs.db.NewIterator(nil, it.ro)
it.open = true
}
ok := it.iter.Seek(it.nextPrefix)
@ -72,9 +90,23 @@ func (it *Iterator) Reset() {
}
}
func (it *Iterator) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Iterator) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *Iterator) Clone() graph.Iterator {
out := NewIterator(it.originalPrefix, it.dir, it.checkId, it.ts)
out.CopyTagsFrom(it)
out := NewIterator(it.originalPrefix, it.dir, it.checkId, it.qs)
out.tags.CopyFrom(it)
return out
}
@ -87,22 +119,22 @@ func (it *Iterator) Close() {
func (it *Iterator) Next() (graph.Value, bool) {
if it.iter == nil {
it.Last = nil
it.result = nil
return nil, false
}
if !it.open {
it.Last = nil
it.result = nil
return nil, false
}
if !it.iter.Valid() {
it.Last = nil
it.result = nil
it.Close()
return nil, false
}
if bytes.HasPrefix(it.iter.Key(), it.nextPrefix) {
out := make([]byte, len(it.iter.Key()))
copy(out, it.iter.Key())
it.Last = out
it.result = out
ok := it.iter.Next()
if !ok {
it.Close()
@ -110,56 +142,73 @@ func (it *Iterator) Next() (graph.Value, bool) {
return out, true
}
it.Close()
it.Last = nil
it.result = nil
return nil, false
}
func PositionOf(prefix []byte, d graph.Direction, ts *TripleStore) int {
func (it *Iterator) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Iterator) Result() graph.Value {
return it.result
}
func (it *Iterator) NextResult() bool {
return false
}
// No subiterators.
func (it *Iterator) SubIterators() []graph.Iterator {
return nil
}
func PositionOf(prefix []byte, d quad.Direction, qs *TripleStore) int {
if bytes.Equal(prefix, []byte("sp")) {
switch d {
case graph.Subject:
case quad.Subject:
return 2
case graph.Predicate:
return ts.hasher.Size() + 2
case graph.Object:
return 2*ts.hasher.Size() + 2
case graph.Provenance:
case quad.Predicate:
return qs.hasher.Size() + 2
case quad.Object:
return 2*qs.hasher.Size() + 2
case quad.Label:
return -1
}
}
if bytes.Equal(prefix, []byte("po")) {
switch d {
case graph.Subject:
return 2*ts.hasher.Size() + 2
case graph.Predicate:
case quad.Subject:
return 2*qs.hasher.Size() + 2
case quad.Predicate:
return 2
case graph.Object:
return ts.hasher.Size() + 2
case graph.Provenance:
case quad.Object:
return qs.hasher.Size() + 2
case quad.Label:
return -1
}
}
if bytes.Equal(prefix, []byte("os")) {
switch d {
case graph.Subject:
return ts.hasher.Size() + 2
case graph.Predicate:
return 2*ts.hasher.Size() + 2
case graph.Object:
case quad.Subject:
return qs.hasher.Size() + 2
case quad.Predicate:
return 2*qs.hasher.Size() + 2
case quad.Object:
return 2
case graph.Provenance:
case quad.Label:
return -1
}
}
if bytes.Equal(prefix, []byte("cp")) {
switch d {
case graph.Subject:
return 2*ts.hasher.Size() + 2
case graph.Predicate:
return ts.hasher.Size() + 2
case graph.Object:
return 3*ts.hasher.Size() + 2
case graph.Provenance:
case quad.Subject:
return 2*qs.hasher.Size() + 2
case quad.Predicate:
return qs.hasher.Size() + 2
case quad.Object:
return 3*qs.hasher.Size() + 2
case quad.Label:
return 2
}
}
@ -171,14 +220,14 @@ func (it *Iterator) Check(v graph.Value) bool {
if val[0] == 'z' {
return false
}
offset := PositionOf(val[0:2], it.dir, it.ts)
offset := PositionOf(val[0:2], it.dir, it.qs)
if offset != -1 {
if bytes.HasPrefix(val[offset:], it.checkId[1:]) {
return true
}
} else {
nameForDir := it.ts.Triple(v).Get(it.dir)
hashForDir := it.ts.ValueOf(nameForDir).([]byte)
nameForDir := it.qs.Quad(v).Get(it.dir)
hashForDir := it.qs.ValueOf(nameForDir).([]byte)
if bytes.Equal(hashForDir, it.checkId) {
return true
}
@ -187,12 +236,12 @@ func (it *Iterator) Check(v graph.Value) bool {
}
func (it *Iterator) Size() (int64, bool) {
return it.ts.SizeOf(it.checkId), true
return it.qs.SizeOf(it.checkId), true
}
func (it *Iterator) DebugString(indent int) string {
size, _ := it.Size()
return fmt.Sprintf("%s(%s %d tags: %v dir: %s size:%d %s)", strings.Repeat(" ", indent), it.Type(), it.UID(), it.Tags(), it.dir, size, it.ts.NameOf(it.checkId))
return fmt.Sprintf("%s(%s %d tags: %v dir: %s size:%d %s)", strings.Repeat(" ", indent), it.Type(), it.UID(), it.tags.Tags(), it.dir, size, it.qs.NameOf(it.checkId))
}
var levelDBType graph.Type

View file

@ -23,10 +23,11 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
func makeTripleSet() []*graph.Triple {
tripleSet := []*graph.Triple{
func makeTripleSet() []*quad.Quad {
tripleSet := []*quad.Quad{
{"A", "follows", "B", ""},
{"C", "follows", "B", ""},
{"C", "follows", "D", ""},
@ -42,20 +43,20 @@ func makeTripleSet() []*graph.Triple {
return tripleSet
}
func iteratedTriples(ts graph.TripleStore, it graph.Iterator) []*graph.Triple {
func iteratedTriples(qs graph.TripleStore, it graph.Iterator) []*quad.Quad {
var res ordered
for {
val, ok := it.Next()
val, ok := graph.Next(it)
if !ok {
break
}
res = append(res, ts.Triple(val))
res = append(res, qs.Quad(val))
}
sort.Sort(res)
return res
}
type ordered []*graph.Triple
type ordered []*quad.Quad
func (o ordered) Len() int { return len(o) }
func (o ordered) Less(i, j int) bool {
@ -72,7 +73,7 @@ func (o ordered) Less(i, j int) bool {
o[i].Subject == o[j].Subject &&
o[i].Predicate == o[j].Predicate &&
o[i].Object == o[j].Object &&
o[i].Provenance < o[j].Provenance:
o[i].Label < o[j].Label:
return true
@ -82,14 +83,14 @@ func (o ordered) Less(i, j int) bool {
}
func (o ordered) Swap(i, j int) { o[i], o[j] = o[j], o[i] }
func iteratedNames(ts graph.TripleStore, it graph.Iterator) []string {
func iteratedNames(qs graph.TripleStore, it graph.Iterator) []string {
var res []string
for {
val, ok := it.Next()
val, ok := graph.Next(it)
if !ok {
break
}
res = append(res, ts.NameOf(val))
res = append(res, qs.NameOf(val))
}
sort.Strings(res)
return res
@ -107,14 +108,14 @@ func TestCreateDatabase(t *testing.T) {
t.Fatal("Failed to create LevelDB database.")
}
ts, err := newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err := newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
if s := ts.Size(); s != 0 {
if s := qs.Size(); s != 0 {
t.Errorf("Unexpected size, got:%d expected:0", s)
}
ts.Close()
qs.Close()
err = createNewLevelDB("/dev/null/some terrible path", nil)
if err == nil {
@ -137,53 +138,53 @@ func TestLoadDatabase(t *testing.T) {
t.Fatal("Failed to create LevelDB database.")
}
ts, err := newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err := newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
ts.AddTriple(&graph.Triple{"Something", "points_to", "Something Else", "context"})
qs.AddTriple(&quad.Quad{"Something", "points_to", "Something Else", "context"})
for _, pq := range []string{"Something", "points_to", "Something Else", "context"} {
if got := ts.NameOf(ts.ValueOf(pq)); got != pq {
if got := qs.NameOf(qs.ValueOf(pq)); got != pq {
t.Errorf("Failed to roundtrip %q, got:%q expect:%q", pq, got, pq)
}
}
if s := ts.Size(); s != 1 {
if s := qs.Size(); s != 1 {
t.Errorf("Unexpected triplestore size, got:%d expect:1", s)
}
ts.Close()
qs.Close()
err = createNewLevelDB(tmpDir, nil)
if err != nil {
t.Fatal("Failed to create LevelDB database.")
}
ts, err = newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err = newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
ts2, didConvert := ts.(*TripleStore)
ts2, didConvert := qs.(*TripleStore)
if !didConvert {
t.Errorf("Could not convert from generic to LevelDB TripleStore")
}
ts.AddTripleSet(makeTripleSet())
if s := ts.Size(); s != 11 {
qs.AddTripleSet(makeTripleSet())
if s := qs.Size(); s != 11 {
t.Errorf("Unexpected triplestore size, got:%d expect:11", s)
}
if s := ts2.SizeOf(ts.ValueOf("B")); s != 5 {
if s := ts2.SizeOf(qs.ValueOf("B")); s != 5 {
t.Errorf("Unexpected triplestore size, got:%d expect:5", s)
}
ts.RemoveTriple(&graph.Triple{"A", "follows", "B", ""})
if s := ts.Size(); s != 10 {
qs.RemoveTriple(&quad.Quad{"A", "follows", "B", ""})
if s := qs.Size(); s != 10 {
t.Errorf("Unexpected triplestore size after RemoveTriple, got:%d expect:10", s)
}
if s := ts2.SizeOf(ts.ValueOf("B")); s != 4 {
if s := ts2.SizeOf(qs.ValueOf("B")); s != 4 {
t.Errorf("Unexpected triplestore size, got:%d expect:4", s)
}
ts.Close()
qs.Close()
}
func TestIterator(t *testing.T) {
@ -199,14 +200,14 @@ func TestIterator(t *testing.T) {
t.Fatal("Failed to create LevelDB database.")
}
ts, err := newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err := newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
ts.AddTripleSet(makeTripleSet())
qs.AddTripleSet(makeTripleSet())
var it graph.Iterator
it = ts.NodesAllIterator()
it = qs.NodesAllIterator()
if it == nil {
t.Fatal("Got nil iterator.")
}
@ -241,7 +242,7 @@ func TestIterator(t *testing.T) {
}
sort.Strings(expect)
for i := 0; i < 2; i++ {
got := iteratedNames(ts, it)
got := iteratedNames(qs, it)
sort.Strings(got)
if !reflect.DeepEqual(got, expect) {
t.Errorf("Unexpected iterated result on repeat %d, got:%v expect:%v", i, got, expect)
@ -250,23 +251,23 @@ func TestIterator(t *testing.T) {
}
for _, pq := range expect {
if !it.Check(ts.ValueOf(pq)) {
if !it.Check(qs.ValueOf(pq)) {
t.Errorf("Failed to find and check %q correctly", pq)
}
}
// FIXME(kortschak) Why does this fail?
/*
for _, pq := range []string{"baller"} {
if it.Check(ts.ValueOf(pq)) {
if it.Check(qs.ValueOf(pq)) {
t.Errorf("Failed to check %q correctly", pq)
}
}
*/
it.Reset()
it = ts.TriplesAllIterator()
edge, _ := it.Next()
triple := ts.Triple(edge)
it = qs.TriplesAllIterator()
edge, _ := graph.Next(it)
triple := qs.Quad(edge)
set := makeTripleSet()
var ok bool
for _, t := range set {
@ -279,7 +280,7 @@ func TestIterator(t *testing.T) {
t.Errorf("Failed to find %q during iteration, got:%q", triple, set)
}
ts.Close()
qs.Close()
}
func TestSetIterator(t *testing.T) {
@ -292,95 +293,95 @@ func TestSetIterator(t *testing.T) {
t.Fatalf("Failed to create working directory")
}
ts, err := newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err := newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
defer ts.Close()
defer qs.Close()
ts.AddTripleSet(makeTripleSet())
qs.AddTripleSet(makeTripleSet())
expect := []*graph.Triple{
expect := []*quad.Quad{
{"C", "follows", "B", ""},
{"C", "follows", "D", ""},
}
sort.Sort(ordered(expect))
// Subject iterator.
it := ts.TripleIterator(graph.Subject, ts.ValueOf("C"))
it := qs.TripleIterator(quad.Subject, qs.ValueOf("C"))
if got := iteratedTriples(ts, it); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, it); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to get expected results, got:%v expect:%v", got, expect)
}
it.Reset()
and := iterator.NewAnd()
and.AddSubIterator(ts.TriplesAllIterator())
and.AddSubIterator(qs.TriplesAllIterator())
and.AddSubIterator(it)
if got := iteratedTriples(ts, and); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, and); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to confirm expected results, got:%v expect:%v", got, expect)
}
// Object iterator.
it = ts.TripleIterator(graph.Object, ts.ValueOf("F"))
it = qs.TripleIterator(quad.Object, qs.ValueOf("F"))
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "follows", "F", ""},
{"E", "follows", "F", ""},
}
sort.Sort(ordered(expect))
if got := iteratedTriples(ts, it); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, it); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to get expected results, got:%v expect:%v", got, expect)
}
and = iterator.NewAnd()
and.AddSubIterator(ts.TripleIterator(graph.Subject, ts.ValueOf("B")))
and.AddSubIterator(qs.TripleIterator(quad.Subject, qs.ValueOf("B")))
and.AddSubIterator(it)
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "follows", "F", ""},
}
if got := iteratedTriples(ts, and); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, and); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to confirm expected results, got:%v expect:%v", got, expect)
}
// Predicate iterator.
it = ts.TripleIterator(graph.Predicate, ts.ValueOf("status"))
it = qs.TripleIterator(quad.Predicate, qs.ValueOf("status"))
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "status", "cool", "status_graph"},
{"D", "status", "cool", "status_graph"},
{"G", "status", "cool", "status_graph"},
}
sort.Sort(ordered(expect))
if got := iteratedTriples(ts, it); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, it); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to get expected results from predicate iterator, got:%v expect:%v", got, expect)
}
// Provenance iterator.
it = ts.TripleIterator(graph.Provenance, ts.ValueOf("status_graph"))
// Label iterator.
it = qs.TripleIterator(quad.Label, qs.ValueOf("status_graph"))
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "status", "cool", "status_graph"},
{"D", "status", "cool", "status_graph"},
{"G", "status", "cool", "status_graph"},
}
sort.Sort(ordered(expect))
if got := iteratedTriples(ts, it); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, it); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to get expected results from label iterator, got:%v expect:%v", got, expect)
}
it.Reset()
// Order is important
and = iterator.NewAnd()
and.AddSubIterator(ts.TripleIterator(graph.Subject, ts.ValueOf("B")))
and.AddSubIterator(qs.TripleIterator(quad.Subject, qs.ValueOf("B")))
and.AddSubIterator(it)
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "status", "cool", "status_graph"},
}
if got := iteratedTriples(ts, and); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, and); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to confirm expected results, got:%v expect:%v", got, expect)
}
it.Reset()
@ -388,12 +389,12 @@ func TestSetIterator(t *testing.T) {
// Order is important
and = iterator.NewAnd()
and.AddSubIterator(it)
and.AddSubIterator(ts.TripleIterator(graph.Subject, ts.ValueOf("B")))
and.AddSubIterator(qs.TripleIterator(quad.Subject, qs.ValueOf("B")))
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "status", "cool", "status_graph"},
}
if got := iteratedTriples(ts, and); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, and); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to confirm expected results, got:%v expect:%v", got, expect)
}
}
@ -406,17 +407,17 @@ func TestOptimize(t *testing.T) {
if err != nil {
t.Fatalf("Failed to create working directory")
}
ts, err := newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err := newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
ts.AddTripleSet(makeTripleSet())
qs.AddTripleSet(makeTripleSet())
// With a linksto-fixed pair
fixed := ts.FixedIterator()
fixed.Add(ts.ValueOf("F"))
fixed.AddTag("internal")
lto := iterator.NewLinksTo(ts, fixed, graph.Object)
fixed := qs.FixedIterator()
fixed.Add(qs.ValueOf("F"))
fixed.Tagger().Add("internal")
lto := iterator.NewLinksTo(qs, fixed, quad.Object)
oldIt := lto.Clone()
newIt, ok := lto.Optimize()
@ -427,16 +428,16 @@ func TestOptimize(t *testing.T) {
t.Errorf("Optimized iterator type does not match original, got:%v expect:%v", newIt.Type(), Type())
}
newTriples := iteratedTriples(ts, newIt)
oldTriples := iteratedTriples(ts, oldIt)
newTriples := iteratedTriples(qs, newIt)
oldTriples := iteratedTriples(qs, oldIt)
if !reflect.DeepEqual(newTriples, oldTriples) {
t.Errorf("Optimized iteration does not match original")
}
oldIt.Next()
graph.Next(oldIt)
oldResults := make(map[string]graph.Value)
oldIt.TagResults(oldResults)
newIt.Next()
graph.Next(newIt)
newResults := make(map[string]graph.Value)
newIt.TagResults(newResults)
if !reflect.DeepEqual(newResults, oldResults) {

@ -30,6 +30,7 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
const (
@ -56,110 +57,110 @@ func createNewLevelDB(path string, _ graph.Options) error {
return err
}
defer db.Close()
ts := &TripleStore{}
ts.db = db
ts.writeopts = &opt.WriteOptions{
qs := &TripleStore{}
qs.db = db
qs.writeopts = &opt.WriteOptions{
Sync: true,
}
ts.Close()
qs.Close()
return nil
}
func newTripleStore(path string, options graph.Options) (graph.TripleStore, error) {
var ts TripleStore
ts.path = path
var qs TripleStore
qs.path = path
cache_size := DefaultCacheSize
if val, ok := options.IntKey("cache_size_mb"); ok {
cache_size = val
}
ts.dbOpts = &opt.Options{
qs.dbOpts = &opt.Options{
BlockCache: cache.NewLRUCache(cache_size * opt.MiB),
}
ts.dbOpts.ErrorIfMissing = true
qs.dbOpts.ErrorIfMissing = true
write_buffer_mb := DefaultWriteBufferSize
if val, ok := options.IntKey("write_buffer_mb"); ok {
write_buffer_mb = val
}
ts.dbOpts.WriteBuffer = write_buffer_mb * opt.MiB
ts.hasher = sha1.New()
ts.writeopts = &opt.WriteOptions{
qs.dbOpts.WriteBuffer = write_buffer_mb * opt.MiB
qs.hasher = sha1.New()
qs.writeopts = &opt.WriteOptions{
Sync: false,
}
ts.readopts = &opt.ReadOptions{}
db, err := leveldb.OpenFile(ts.path, ts.dbOpts)
qs.readopts = &opt.ReadOptions{}
db, err := leveldb.OpenFile(qs.path, qs.dbOpts)
if err != nil {
panic("Error, couldn't open! " + err.Error())
}
ts.db = db
glog.Infoln(ts.GetStats())
ts.getSize()
return &ts, nil
qs.db = db
glog.Infoln(qs.GetStats())
qs.getSize()
return &qs, nil
}
func (ts *TripleStore) GetStats() string {
func (qs *TripleStore) GetStats() string {
out := ""
stats, err := ts.db.GetProperty("leveldb.stats")
stats, err := qs.db.GetProperty("leveldb.stats")
if err == nil {
out += fmt.Sprintln("Stats: ", stats)
}
out += fmt.Sprintln("Size: ", ts.size)
out += fmt.Sprintln("Size: ", qs.size)
return out
}
func (ts *TripleStore) Size() int64 {
return ts.size
func (qs *TripleStore) Size() int64 {
return qs.size
}
func (ts *TripleStore) createKeyFor(d [3]graph.Direction, triple *graph.Triple) []byte {
key := make([]byte, 0, 2+(ts.hasher.Size()*3))
func (qs *TripleStore) createKeyFor(d [3]quad.Direction, triple *quad.Quad) []byte {
key := make([]byte, 0, 2+(qs.hasher.Size()*3))
// TODO(kortschak) Remove dependence on String() method.
key = append(key, []byte{d[0].Prefix(), d[1].Prefix()}...)
key = append(key, ts.convertStringToByteHash(triple.Get(d[0]))...)
key = append(key, ts.convertStringToByteHash(triple.Get(d[1]))...)
key = append(key, ts.convertStringToByteHash(triple.Get(d[2]))...)
key = append(key, qs.convertStringToByteHash(triple.Get(d[0]))...)
key = append(key, qs.convertStringToByteHash(triple.Get(d[1]))...)
key = append(key, qs.convertStringToByteHash(triple.Get(d[2]))...)
return key
}
func (ts *TripleStore) createProvKeyFor(d [3]graph.Direction, triple *graph.Triple) []byte {
key := make([]byte, 0, 2+(ts.hasher.Size()*4))
func (qs *TripleStore) createProvKeyFor(d [3]quad.Direction, triple *quad.Quad) []byte {
key := make([]byte, 0, 2+(qs.hasher.Size()*4))
// TODO(kortschak) Remove dependence on String() method.
key = append(key, []byte{graph.Provenance.Prefix(), d[0].Prefix()}...)
key = append(key, ts.convertStringToByteHash(triple.Get(graph.Provenance))...)
key = append(key, ts.convertStringToByteHash(triple.Get(d[0]))...)
key = append(key, ts.convertStringToByteHash(triple.Get(d[1]))...)
key = append(key, ts.convertStringToByteHash(triple.Get(d[2]))...)
key = append(key, []byte{quad.Label.Prefix(), d[0].Prefix()}...)
key = append(key, qs.convertStringToByteHash(triple.Get(quad.Label))...)
key = append(key, qs.convertStringToByteHash(triple.Get(d[0]))...)
key = append(key, qs.convertStringToByteHash(triple.Get(d[1]))...)
key = append(key, qs.convertStringToByteHash(triple.Get(d[2]))...)
return key
}
func (ts *TripleStore) createValueKeyFor(s string) []byte {
key := make([]byte, 0, 1+ts.hasher.Size())
func (qs *TripleStore) createValueKeyFor(s string) []byte {
key := make([]byte, 0, 1+qs.hasher.Size())
key = append(key, []byte("z")...)
key = append(key, ts.convertStringToByteHash(s)...)
key = append(key, qs.convertStringToByteHash(s)...)
return key
}
func (ts *TripleStore) AddTriple(t *graph.Triple) {
func (qs *TripleStore) AddTriple(t *quad.Quad) {
batch := &leveldb.Batch{}
ts.buildWrite(batch, t)
err := ts.db.Write(batch, ts.writeopts)
qs.buildWrite(batch, t)
err := qs.db.Write(batch, qs.writeopts)
if err != nil {
glog.Errorf("Couldn't write to DB for triple %s", t)
return
}
ts.size++
qs.size++
}
// Shorthand for direction permutations.
var (
spo = [3]graph.Direction{graph.Subject, graph.Predicate, graph.Object}
osp = [3]graph.Direction{graph.Object, graph.Subject, graph.Predicate}
pos = [3]graph.Direction{graph.Predicate, graph.Object, graph.Subject}
pso = [3]graph.Direction{graph.Predicate, graph.Subject, graph.Object}
spo = [3]quad.Direction{quad.Subject, quad.Predicate, quad.Object}
osp = [3]quad.Direction{quad.Object, quad.Subject, quad.Predicate}
pos = [3]quad.Direction{quad.Predicate, quad.Object, quad.Subject}
pso = [3]quad.Direction{quad.Predicate, quad.Subject, quad.Object}
)
func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
_, err := ts.db.Get(ts.createKeyFor(spo, t), ts.readopts)
func (qs *TripleStore) RemoveTriple(t *quad.Quad) {
_, err := qs.db.Get(qs.createKeyFor(spo, t), qs.readopts)
if err != nil && err != leveldb.ErrNotFound {
glog.Errorf("Couldn't access DB to confirm deletion")
return
@ -169,45 +170,45 @@ func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
return
}
batch := &leveldb.Batch{}
batch.Delete(ts.createKeyFor(spo, t))
batch.Delete(ts.createKeyFor(osp, t))
batch.Delete(ts.createKeyFor(pos, t))
ts.UpdateValueKeyBy(t.Get(graph.Subject), -1, batch)
ts.UpdateValueKeyBy(t.Get(graph.Predicate), -1, batch)
ts.UpdateValueKeyBy(t.Get(graph.Object), -1, batch)
if t.Get(graph.Provenance) != "" {
batch.Delete(ts.createProvKeyFor(pso, t))
ts.UpdateValueKeyBy(t.Get(graph.Provenance), -1, batch)
batch.Delete(qs.createKeyFor(spo, t))
batch.Delete(qs.createKeyFor(osp, t))
batch.Delete(qs.createKeyFor(pos, t))
qs.UpdateValueKeyBy(t.Get(quad.Subject), -1, batch)
qs.UpdateValueKeyBy(t.Get(quad.Predicate), -1, batch)
qs.UpdateValueKeyBy(t.Get(quad.Object), -1, batch)
if t.Get(quad.Label) != "" {
batch.Delete(qs.createProvKeyFor(pso, t))
qs.UpdateValueKeyBy(t.Get(quad.Label), -1, batch)
}
err = ts.db.Write(batch, nil)
err = qs.db.Write(batch, nil)
if err != nil {
glog.Errorf("Couldn't delete triple %s", t)
return
}
ts.size--
qs.size--
}
func (ts *TripleStore) buildTripleWrite(batch *leveldb.Batch, t *graph.Triple) {
func (qs *TripleStore) buildTripleWrite(batch *leveldb.Batch, t *quad.Quad) {
bytes, err := json.Marshal(*t)
if err != nil {
glog.Errorf("Couldn't write to buffer for triple %s\n %s\n", t, err)
return
}
batch.Put(ts.createKeyFor(spo, t), bytes)
batch.Put(ts.createKeyFor(osp, t), bytes)
batch.Put(ts.createKeyFor(pos, t), bytes)
if t.Get(graph.Provenance) != "" {
batch.Put(ts.createProvKeyFor(pso, t), bytes)
batch.Put(qs.createKeyFor(spo, t), bytes)
batch.Put(qs.createKeyFor(osp, t), bytes)
batch.Put(qs.createKeyFor(pos, t), bytes)
if t.Get(quad.Label) != "" {
batch.Put(qs.createProvKeyFor(pso, t), bytes)
}
}
func (ts *TripleStore) buildWrite(batch *leveldb.Batch, t *graph.Triple) {
ts.buildTripleWrite(batch, t)
ts.UpdateValueKeyBy(t.Get(graph.Subject), 1, nil)
ts.UpdateValueKeyBy(t.Get(graph.Predicate), 1, nil)
ts.UpdateValueKeyBy(t.Get(graph.Object), 1, nil)
if t.Get(graph.Provenance) != "" {
ts.UpdateValueKeyBy(t.Get(graph.Provenance), 1, nil)
func (qs *TripleStore) buildWrite(batch *leveldb.Batch, t *quad.Quad) {
qs.buildTripleWrite(batch, t)
qs.UpdateValueKeyBy(t.Get(quad.Subject), 1, nil)
qs.UpdateValueKeyBy(t.Get(quad.Predicate), 1, nil)
qs.UpdateValueKeyBy(t.Get(quad.Object), 1, nil)
if t.Get(quad.Label) != "" {
qs.UpdateValueKeyBy(t.Get(quad.Label), 1, nil)
}
}
@ -216,10 +217,10 @@ type ValueData struct {
Size int64
}
func (ts *TripleStore) UpdateValueKeyBy(name string, amount int, batch *leveldb.Batch) {
func (qs *TripleStore) UpdateValueKeyBy(name string, amount int, batch *leveldb.Batch) {
value := &ValueData{name, int64(amount)}
key := ts.createValueKeyFor(name)
b, err := ts.db.Get(key, ts.readopts)
key := qs.createValueKeyFor(name)
b, err := qs.db.Get(key, qs.readopts)
// Error getting the node from the database.
if err != nil && err != leveldb.ErrNotFound {
@ -241,7 +242,7 @@ func (ts *TripleStore) UpdateValueKeyBy(name string, amount int, batch *leveldb.
if amount < 0 {
if value.Size <= 0 {
if batch == nil {
ts.db.Delete(key, ts.writeopts)
qs.db.Delete(key, qs.writeopts)
} else {
batch.Delete(key)
}
@ -256,88 +257,88 @@ func (ts *TripleStore) UpdateValueKeyBy(name string, amount int, batch *leveldb.
return
}
if batch == nil {
ts.db.Put(key, bytes, ts.writeopts)
qs.db.Put(key, bytes, qs.writeopts)
} else {
batch.Put(key, bytes)
}
}
func (ts *TripleStore) AddTripleSet(t_s []*graph.Triple) {
func (qs *TripleStore) AddTripleSet(t_s []*quad.Quad) {
batch := &leveldb.Batch{}
newTs := len(t_s)
resizeMap := make(map[string]int)
for _, t := range t_s {
ts.buildTripleWrite(batch, t)
qs.buildTripleWrite(batch, t)
resizeMap[t.Subject]++
resizeMap[t.Predicate]++
resizeMap[t.Object]++
if t.Provenance != "" {
resizeMap[t.Provenance]++
if t.Label != "" {
resizeMap[t.Label]++
}
}
for k, v := range resizeMap {
ts.UpdateValueKeyBy(k, v, batch)
qs.UpdateValueKeyBy(k, v, batch)
}
err := ts.db.Write(batch, ts.writeopts)
err := qs.db.Write(batch, qs.writeopts)
if err != nil {
glog.Errorf("Couldn't write to DB for tripleset")
return
}
ts.size += int64(newTs)
qs.size += int64(newTs)
}
func (ts *TripleStore) Close() {
func (qs *TripleStore) Close() {
buf := new(bytes.Buffer)
err := binary.Write(buf, binary.LittleEndian, ts.size)
err := binary.Write(buf, binary.LittleEndian, qs.size)
if err == nil {
werr := ts.db.Put([]byte("__size"), buf.Bytes(), ts.writeopts)
werr := qs.db.Put([]byte("__size"), buf.Bytes(), qs.writeopts)
if werr != nil {
glog.Errorf("Couldn't write size before closing!")
}
} else {
glog.Errorf("Couldn't convert size before closing!")
}
ts.db.Close()
ts.open = false
qs.db.Close()
qs.open = false
}
func (ts *TripleStore) Triple(k graph.Value) *graph.Triple {
var triple graph.Triple
b, err := ts.db.Get(k.([]byte), ts.readopts)
func (qs *TripleStore) Quad(k graph.Value) *quad.Quad {
var triple quad.Quad
b, err := qs.db.Get(k.([]byte), qs.readopts)
if err != nil && err != leveldb.ErrNotFound {
glog.Errorln("Error: couldn't get triple from DB")
return &graph.Triple{}
return &quad.Quad{}
}
if err == leveldb.ErrNotFound {
// No harm, no foul.
return &graph.Triple{}
return &quad.Quad{}
}
err = json.Unmarshal(b, &triple)
if err != nil {
glog.Errorln("Error: couldn't reconstruct triple")
return &graph.Triple{}
return &quad.Quad{}
}
return &triple
}
func (ts *TripleStore) convertStringToByteHash(s string) []byte {
ts.hasher.Reset()
key := make([]byte, 0, ts.hasher.Size())
ts.hasher.Write([]byte(s))
key = ts.hasher.Sum(key)
func (qs *TripleStore) convertStringToByteHash(s string) []byte {
qs.hasher.Reset()
key := make([]byte, 0, qs.hasher.Size())
qs.hasher.Write([]byte(s))
key = qs.hasher.Sum(key)
return key
}
func (ts *TripleStore) ValueOf(s string) graph.Value {
return ts.createValueKeyFor(s)
func (qs *TripleStore) ValueOf(s string) graph.Value {
return qs.createValueKeyFor(s)
}
func (ts *TripleStore) valueData(value_key []byte) ValueData {
func (qs *TripleStore) valueData(value_key []byte) ValueData {
var out ValueData
if glog.V(3) {
glog.V(3).Infof("%s %v\n", string(value_key[0]), value_key)
}
b, err := ts.db.Get(value_key, ts.readopts)
b, err := qs.db.Get(value_key, qs.readopts)
if err != nil && err != leveldb.ErrNotFound {
glog.Errorln("Error: couldn't get value from DB")
return out
@ -352,30 +353,30 @@ func (ts *TripleStore) valueData(value_key []byte) ValueData {
return out
}
func (ts *TripleStore) NameOf(k graph.Value) string {
func (qs *TripleStore) NameOf(k graph.Value) string {
if k == nil {
glog.V(2).Infoln("k was nil")
return ""
}
return ts.valueData(k.([]byte)).Name
return qs.valueData(k.([]byte)).Name
}
func (ts *TripleStore) SizeOf(k graph.Value) int64 {
func (qs *TripleStore) SizeOf(k graph.Value) int64 {
if k == nil {
return 0
}
return int64(ts.valueData(k.([]byte)).Size)
return int64(qs.valueData(k.([]byte)).Size)
}
func (ts *TripleStore) getSize() {
func (qs *TripleStore) getSize() {
var size int64
b, err := ts.db.Get([]byte("__size"), ts.readopts)
b, err := qs.db.Get([]byte("__size"), qs.readopts)
if err != nil && err != leveldb.ErrNotFound {
panic("Couldn't read size " + err.Error())
}
if err == leveldb.ErrNotFound {
// Must be a new database. Cool
ts.size = 0
qs.size = 0
return
}
buf := bytes.NewBuffer(b)
@ -383,10 +384,10 @@ func (ts *TripleStore) getSize() {
if err != nil {
glog.Errorln("Error: couldn't parse size")
}
ts.size = size
qs.size = size
}
func (ts *TripleStore) SizeOfPrefix(pre []byte) (int64, error) {
func (qs *TripleStore) SizeOfPrefix(pre []byte) (int64, error) {
limit := make([]byte, len(pre))
copy(limit, pre)
end := len(limit) - 1
@ -394,45 +395,45 @@ func (ts *TripleStore) SizeOfPrefix(pre []byte) (int64, error) {
ranges := make([]util.Range, 1)
ranges[0].Start = pre
ranges[0].Limit = limit
sizes, err := ts.db.SizeOf(ranges)
sizes, err := qs.db.SizeOf(ranges)
if err == nil {
return (int64(sizes[0]) >> 6) + 1, nil
}
return 0, nil
}
func (ts *TripleStore) TripleIterator(d graph.Direction, val graph.Value) graph.Iterator {
func (qs *TripleStore) TripleIterator(d quad.Direction, val graph.Value) graph.Iterator {
var prefix string
switch d {
case graph.Subject:
case quad.Subject:
prefix = "sp"
case graph.Predicate:
case quad.Predicate:
prefix = "po"
case graph.Object:
case quad.Object:
prefix = "os"
case graph.Provenance:
case quad.Label:
prefix = "cp"
default:
panic("unreachable " + d.String())
}
return NewIterator(prefix, d, val, ts)
return NewIterator(prefix, d, val, qs)
}
func (ts *TripleStore) NodesAllIterator() graph.Iterator {
return NewAllIterator("z", graph.Any, ts)
func (qs *TripleStore) NodesAllIterator() graph.Iterator {
return NewAllIterator("z", quad.Any, qs)
}
func (ts *TripleStore) TriplesAllIterator() graph.Iterator {
return NewAllIterator("po", graph.Predicate, ts)
func (qs *TripleStore) TriplesAllIterator() graph.Iterator {
return NewAllIterator("po", quad.Predicate, qs)
}
func (ts *TripleStore) TripleDirection(val graph.Value, d graph.Direction) graph.Value {
func (qs *TripleStore) TripleDirection(val graph.Value, d quad.Direction) graph.Value {
v := val.([]uint8)
offset := PositionOf(v[0:2], d, ts)
offset := PositionOf(v[0:2], d, qs)
if offset != -1 {
return append([]byte("z"), v[offset:offset+ts.hasher.Size()]...)
return append([]byte("z"), v[offset:offset+qs.hasher.Size()]...)
} else {
return ts.Triple(val).Get(d)
return qs.Quad(val).Get(d)
}
}
@ -440,7 +441,7 @@ func compareBytes(a, b graph.Value) bool {
return bytes.Equal(a.([]uint8), b.([]uint8))
}
func (ts *TripleStore) FixedIterator() graph.FixedIterator {
func (qs *TripleStore) FixedIterator() graph.FixedIterator {
return iterator.NewFixedIteratorWithCompare(compareBytes)
}

@ -37,14 +37,15 @@ func (ts *TripleStore) optimizeLinksTo(it *iterator.LinksTo) (graph.Iterator, bo
if primary.Type() == graph.Fixed {
size, _ := primary.Size()
if size == 1 {
val, ok := primary.Next()
val, ok := graph.Next(primary)
if !ok {
panic("Sizes lie")
}
newIt := ts.TripleIterator(it.Direction(), val)
newIt.CopyTagsFrom(it)
for _, tag := range primary.Tags() {
newIt.AddFixedTag(tag, val)
nt := newIt.Tagger()
nt.CopyFrom(it)
for _, tag := range primary.Tagger().Tags() {
nt.AddFixed(tag, val)
}
it.Close()
return newIt, true

@ -31,6 +31,11 @@ func NewMemstoreAllIterator(ts *TripleStore) *AllIterator {
return &out
}
// No subiterators.
func (it *AllIterator) SubIterators() []graph.Iterator {
return nil
}
func (it *AllIterator) Next() (graph.Value, bool) {
next, out := it.Int64.Next()
if !out {
@ -41,6 +46,5 @@ func (it *AllIterator) Next() (graph.Value, bool) {
if !ok {
return it.Next()
}
it.Last = next
return next, out
}

@ -26,11 +26,13 @@ import (
)
type Iterator struct {
iterator.Base
uid uint64
tags graph.Tagger
tree *llrb.LLRB
data string
isRunning bool
iterLast Int64
result graph.Value
}
type Int64 int64
@ -53,34 +55,69 @@ func IterateOne(tree *llrb.LLRB, last Int64) Int64 {
}
func NewLlrbIterator(tree *llrb.LLRB, data string) *Iterator {
var it Iterator
iterator.BaseInit(&it.Base)
it.tree = tree
it.iterLast = Int64(-1)
it.data = data
return &it
return &Iterator{
uid: iterator.NextUID(),
tree: tree,
iterLast: Int64(-1),
data: data,
}
}
func (it *Iterator) UID() uint64 {
return it.uid
}
func (it *Iterator) Reset() {
it.iterLast = Int64(-1)
}
func (it *Iterator) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Iterator) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *Iterator) Clone() graph.Iterator {
var new_it = NewLlrbIterator(it.tree, it.data)
new_it.CopyTagsFrom(it)
return new_it
m := NewLlrbIterator(it.tree, it.data)
m.tags.CopyFrom(it)
return m
}
func (it *Iterator) Close() {}
func (it *Iterator) Next() (graph.Value, bool) {
graph.NextLogIn(it)
if it.tree.Max() == nil || it.Last == int64(it.tree.Max().(Int64)) {
if it.tree.Max() == nil || it.result == int64(it.tree.Max().(Int64)) {
return graph.NextLogOut(it, nil, false)
}
it.iterLast = IterateOne(it.tree, it.iterLast)
it.Last = int64(it.iterLast)
return graph.NextLogOut(it, it.Last, true)
it.result = int64(it.iterLast)
return graph.NextLogOut(it, it.result, true)
}
func (it *Iterator) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Iterator) Result() graph.Value {
return it.result
}
func (it *Iterator) NextResult() bool {
return false
}
// No subiterators.
func (it *Iterator) SubIterators() []graph.Iterator {
return nil
}
func (it *Iterator) Size() (int64, bool) {
@ -90,7 +127,7 @@ func (it *Iterator) Size() (int64, bool) {
func (it *Iterator) Check(v graph.Value) bool {
graph.CheckLogIn(it, v)
if it.tree.Has(Int64(v.(int64))) {
it.Last = v
it.result = v
return graph.CheckLogOut(it, v, true)
}
return graph.CheckLogOut(it, v, false)
@ -98,7 +135,7 @@ func (it *Iterator) Check(v graph.Value) bool {
func (it *Iterator) DebugString(indent int) string {
size, _ := it.Size()
return fmt.Sprintf("%s(%s tags:%s size:%d %s)", strings.Repeat(" ", indent), it.Type(), it.Tags(), size, it.data)
return fmt.Sprintf("%s(%s tags:%s size:%d %s)", strings.Repeat(" ", indent), it.Type(), it.tags.Tags(), size, it.data)
}
var memType graph.Type

@ -20,15 +20,16 @@ import (
"github.com/barakmich/glog"
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
"github.com/petar/GoLLRB/llrb"
)
type TripleDirectionIndex struct {
subject map[int64]*llrb.LLRB
predicate map[int64]*llrb.LLRB
object map[int64]*llrb.LLRB
provenance map[int64]*llrb.LLRB
subject map[int64]*llrb.LLRB
predicate map[int64]*llrb.LLRB
object map[int64]*llrb.LLRB
label map[int64]*llrb.LLRB
}
func NewTripleDirectionIndex() *TripleDirectionIndex {
@ -36,25 +37,25 @@ func NewTripleDirectionIndex() *TripleDirectionIndex {
tdi.subject = make(map[int64]*llrb.LLRB)
tdi.predicate = make(map[int64]*llrb.LLRB)
tdi.object = make(map[int64]*llrb.LLRB)
tdi.provenance = make(map[int64]*llrb.LLRB)
tdi.label = make(map[int64]*llrb.LLRB)
return &tdi
}
func (tdi *TripleDirectionIndex) GetForDir(d graph.Direction) map[int64]*llrb.LLRB {
func (tdi *TripleDirectionIndex) GetForDir(d quad.Direction) map[int64]*llrb.LLRB {
switch d {
case graph.Subject:
case quad.Subject:
return tdi.subject
case graph.Object:
case quad.Object:
return tdi.object
case graph.Predicate:
case quad.Predicate:
return tdi.predicate
case graph.Provenance:
return tdi.provenance
case quad.Label:
return tdi.label
}
panic("illegal direction")
}
func (tdi *TripleDirectionIndex) GetOrCreate(d graph.Direction, id int64) *llrb.LLRB {
func (tdi *TripleDirectionIndex) GetOrCreate(d quad.Direction, id int64) *llrb.LLRB {
directionIndex := tdi.GetForDir(d)
if _, ok := directionIndex[id]; !ok {
directionIndex[id] = llrb.New()
@ -62,7 +63,7 @@ func (tdi *TripleDirectionIndex) GetOrCreate(d graph.Direction, id int64) *llrb.
return directionIndex[id]
}
func (tdi *TripleDirectionIndex) Get(d graph.Direction, id int64) (*llrb.LLRB, bool) {
func (tdi *TripleDirectionIndex) Get(d quad.Direction, id int64) (*llrb.LLRB, bool) {
directionIndex := tdi.GetForDir(d)
tree, exists := directionIndex[id]
return tree, exists
@ -73,7 +74,7 @@ type TripleStore struct {
tripleIdCounter int64
idMap map[string]int64
revIdMap map[int64]string
triples []graph.Triple
triples []quad.Quad
size int64
index TripleDirectionIndex
// vip_index map[string]map[int64]map[string]map[int64]*llrb.Tree
@ -83,10 +84,10 @@ func newTripleStore() *TripleStore {
var ts TripleStore
ts.idMap = make(map[string]int64)
ts.revIdMap = make(map[int64]string)
ts.triples = make([]graph.Triple, 1, 200)
ts.triples = make([]quad.Quad, 1, 200)
// Sentinel null triple so triple indices start at 1
ts.triples[0] = graph.Triple{}
ts.triples[0] = quad.Quad{}
ts.size = 1
ts.index = *NewTripleDirectionIndex()
ts.idCounter = 1
@ -94,18 +95,18 @@ func newTripleStore() *TripleStore {
return &ts
}
func (ts *TripleStore) AddTripleSet(triples []*graph.Triple) {
func (ts *TripleStore) AddTripleSet(triples []*quad.Quad) {
for _, t := range triples {
ts.AddTriple(t)
}
}
func (ts *TripleStore) tripleExists(t *graph.Triple) (bool, int64) {
func (ts *TripleStore) tripleExists(t *quad.Quad) (bool, int64) {
smallest := -1
var smallest_tree *llrb.LLRB
for d := graph.Subject; d <= graph.Provenance; d++ {
for d := quad.Subject; d <= quad.Label; d++ {
sid := t.Get(d)
if d == graph.Provenance && sid == "" {
if d == quad.Label && sid == "" {
continue
}
id, ok := ts.idMap[sid]
@ -137,7 +138,7 @@ func (ts *TripleStore) tripleExists(t *graph.Triple) (bool, int64) {
return false, 0
}
func (ts *TripleStore) AddTriple(t *graph.Triple) {
func (ts *TripleStore) AddTriple(t *quad.Quad) {
if exists, _ := ts.tripleExists(t); exists {
return
}
@ -147,9 +148,9 @@ func (ts *TripleStore) AddTriple(t *graph.Triple) {
ts.size++
ts.tripleIdCounter++
for d := graph.Subject; d <= graph.Provenance; d++ {
for d := quad.Subject; d <= quad.Label; d++ {
sid := t.Get(d)
if d == graph.Provenance && sid == "" {
if d == quad.Label && sid == "" {
continue
}
if _, ok := ts.idMap[sid]; !ok {
@ -159,8 +160,8 @@ func (ts *TripleStore) AddTriple(t *graph.Triple) {
}
}
for d := graph.Subject; d <= graph.Provenance; d++ {
if d == graph.Provenance && t.Get(d) == "" {
for d := quad.Subject; d <= quad.Label; d++ {
if d == quad.Label && t.Get(d) == "" {
continue
}
id := ts.idMap[t.Get(d)]
@ -171,7 +172,7 @@ func (ts *TripleStore) AddTriple(t *graph.Triple) {
// TODO(barakmich): Add VIP indexing
}
func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
func (ts *TripleStore) RemoveTriple(t *quad.Quad) {
var tripleID int64
var exists bool
tripleID = 0
@ -179,11 +180,11 @@ func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
return
}
ts.triples[tripleID] = graph.Triple{}
ts.triples[tripleID] = quad.Quad{}
ts.size--
for d := graph.Subject; d <= graph.Provenance; d++ {
if d == graph.Provenance && t.Get(d) == "" {
for d := quad.Subject; d <= quad.Label; d++ {
if d == quad.Label && t.Get(d) == "" {
continue
}
id := ts.idMap[t.Get(d)]
@ -191,8 +192,8 @@ func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
tree.Delete(Int64(tripleID))
}
for d := graph.Subject; d <= graph.Provenance; d++ {
if d == graph.Provenance && t.Get(d) == "" {
for d := quad.Subject; d <= quad.Label; d++ {
if d == quad.Label && t.Get(d) == "" {
continue
}
id, ok := ts.idMap[t.Get(d)]
@ -200,8 +201,8 @@ func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
continue
}
stillExists := false
for d := graph.Subject; d <= graph.Provenance; d++ {
if d == graph.Provenance && t.Get(d) == "" {
for d := quad.Subject; d <= quad.Label; d++ {
if d == quad.Label && t.Get(d) == "" {
continue
}
nodeTree := ts.index.GetOrCreate(d, id)
@ -217,11 +218,11 @@ func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
}
}
func (ts *TripleStore) Triple(index graph.Value) *graph.Triple {
func (ts *TripleStore) Quad(index graph.Value) *quad.Quad {
return &ts.triples[index.(int64)]
}
func (ts *TripleStore) TripleIterator(d graph.Direction, value graph.Value) graph.Iterator {
func (ts *TripleStore) TripleIterator(d quad.Direction, value graph.Value) graph.Iterator {
index, ok := ts.index.Get(d, value.(int64))
data := fmt.Sprintf("dir:%s val:%d", d, value.(int64))
if ok {
@ -259,8 +260,8 @@ func (ts *TripleStore) FixedIterator() graph.FixedIterator {
return iterator.NewFixedIteratorWithCompare(iterator.BasicEquality)
}
func (ts *TripleStore) TripleDirection(val graph.Value, d graph.Direction) graph.Value {
name := ts.Triple(val).Get(d)
func (ts *TripleStore) TripleDirection(val graph.Value, d quad.Direction) graph.Value {
name := ts.Quad(val).Get(d)
return ts.ValueOf(name)
}

View file

@ -37,14 +37,15 @@ func (ts *TripleStore) optimizeLinksTo(it *iterator.LinksTo) (graph.Iterator, bo
if primary.Type() == graph.Fixed {
size, _ := primary.Size()
if size == 1 {
val, ok := primary.Next()
val, ok := graph.Next(primary)
if !ok {
panic("Sizes lie")
}
newIt := ts.TripleIterator(it.Direction(), val)
newIt.CopyTagsFrom(it)
for _, tag := range primary.Tags() {
newIt.AddFixedTag(tag, val)
nt := newIt.Tagger()
nt.CopyFrom(it)
for _, tag := range primary.Tagger().Tags() {
nt.AddFixed(tag, val)
}
return newIt, true
}

View file

@ -21,6 +21,7 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
// This is a simple test graph.
@ -36,7 +37,7 @@ import (
// \-->|#D#|------------->+---+
// +---+
//
var simpleGraph = []*graph.Triple{
var simpleGraph = []*quad.Quad{
{"A", "follows", "B", ""},
{"C", "follows", "B", ""},
{"C", "follows", "D", ""},
@ -50,7 +51,7 @@ var simpleGraph = []*graph.Triple{
{"G", "status", "cool", "status_graph"},
}
func makeTestStore(data []*graph.Triple) (*TripleStore, []pair) {
func makeTestStore(data []*quad.Quad) (*TripleStore, []pair) {
seen := make(map[string]struct{})
ts := newTripleStore()
var (
@ -58,7 +59,7 @@ func makeTestStore(data []*graph.Triple) (*TripleStore, []pair) {
ind []pair
)
for _, t := range data {
for _, qp := range []string{t.Subject, t.Predicate, t.Object, t.Provenance} {
for _, qp := range []string{t.Subject, t.Predicate, t.Object, t.Label} {
if _, ok := seen[qp]; !ok && qp != "" {
val++
ind = append(ind, pair{qp, val})
@ -105,10 +106,10 @@ func TestIteratorsAndNextResultOrderA(t *testing.T) {
all := ts.NodesAllIterator()
innerAnd := iterator.NewAnd()
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed2, graph.Predicate))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, all, graph.Object))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed2, quad.Predicate))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, all, quad.Object))
hasa := iterator.NewHasA(ts, innerAnd, graph.Subject)
hasa := iterator.NewHasA(ts, innerAnd, quad.Subject)
outerAnd := iterator.NewAnd()
outerAnd.AddSubIterator(fixed)
outerAnd.AddSubIterator(hasa)
@ -149,8 +150,8 @@ func TestLinksToOptimization(t *testing.T) {
fixed := ts.FixedIterator()
fixed.Add(ts.ValueOf("cool"))
lto := iterator.NewLinksTo(ts, fixed, graph.Object)
lto.AddTag("foo")
lto := iterator.NewLinksTo(ts, fixed, quad.Object)
lto.Tagger().Add("foo")
newIt, changed := lto.Optimize()
if !changed {
@ -165,7 +166,8 @@ func TestLinksToOptimization(t *testing.T) {
if v_clone.DebugString(0) != v.DebugString(0) {
t.Fatal("Wrong iterator. Got ", v_clone.DebugString(0))
}
if len(v_clone.Tags()) < 1 || v_clone.Tags()[0] != "foo" {
vt := v_clone.Tagger()
if len(vt.Tags()) < 1 || vt.Tags()[0] != "foo" {
t.Fatal("Tag on LinksTo did not persist")
}
}
@ -173,7 +175,7 @@ func TestLinksToOptimization(t *testing.T) {
func TestRemoveTriple(t *testing.T) {
ts, _ := makeTestStore(simpleGraph)
ts.RemoveTriple(&graph.Triple{"E", "follows", "F", ""})
ts.RemoveTriple(&quad.Quad{"E", "follows", "F", ""})
fixed := ts.FixedIterator()
fixed.Add(ts.ValueOf("E"))
@ -182,13 +184,13 @@ func TestRemoveTriple(t *testing.T) {
fixed2.Add(ts.ValueOf("follows"))
innerAnd := iterator.NewAnd()
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed, graph.Subject))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed2, graph.Predicate))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed, quad.Subject))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed2, quad.Predicate))
hasa := iterator.NewHasA(ts, innerAnd, graph.Object)
hasa := iterator.NewHasA(ts, innerAnd, quad.Object)
newIt, _ := hasa.Optimize()
_, ok := newIt.Next()
_, ok := graph.Next(newIt)
if ok {
t.Error("E should not have any followers.")
}

View file

@ -24,12 +24,14 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
type Iterator struct {
iterator.Base
ts *TripleStore
dir graph.Direction
uid uint64
tags graph.Tagger
qs *TripleStore
dir quad.Direction
iter *mgo.Iter
hash string
name string
@ -37,60 +39,73 @@ type Iterator struct {
isAll bool
constraint bson.M
collection string
result graph.Value
}
func NewIterator(ts *TripleStore, collection string, d graph.Direction, val graph.Value) *Iterator {
var m Iterator
iterator.BaseInit(&m.Base)
func NewIterator(qs *TripleStore, collection string, d quad.Direction, val graph.Value) *Iterator {
name := qs.NameOf(val)
m.name = ts.NameOf(val)
m.collection = collection
var constraint bson.M
switch d {
case graph.Subject:
m.constraint = bson.M{"Subject": m.name}
case graph.Predicate:
m.constraint = bson.M{"Predicate": m.name}
case graph.Object:
m.constraint = bson.M{"Object": m.name}
case graph.Provenance:
m.constraint = bson.M{"Provenance": m.name}
case quad.Subject:
constraint = bson.M{"Subject": name}
case quad.Predicate:
constraint = bson.M{"Predicate": name}
case quad.Object:
constraint = bson.M{"Object": name}
case quad.Label:
constraint = bson.M{"Label": name}
}
m.ts = ts
m.dir = d
m.iter = ts.db.C(collection).Find(m.constraint).Iter()
size, err := ts.db.C(collection).Find(m.constraint).Count()
size, err := qs.db.C(collection).Find(constraint).Count()
if err != nil {
// FIXME(kortschak) This should be passed back rather than just logging.
glog.Errorln("Trouble getting size for iterator! ", err)
return nil
}
m.size = int64(size)
m.hash = val.(string)
m.isAll = false
return &m
return &Iterator{
uid: iterator.NextUID(),
name: name,
constraint: constraint,
collection: collection,
qs: qs,
dir: d,
iter: qs.db.C(collection).Find(constraint).Iter(),
size: int64(size),
hash: val.(string),
isAll: false,
}
}
func NewAllIterator(ts *TripleStore, collection string) *Iterator {
var m Iterator
m.ts = ts
m.dir = graph.Any
m.constraint = nil
m.collection = collection
m.iter = ts.db.C(collection).Find(nil).Iter()
size, err := ts.db.C(collection).Count()
func NewAllIterator(qs *TripleStore, collection string) *Iterator {
size, err := qs.db.C(collection).Count()
if err != nil {
// FIXME(kortschak) This should be passed back rather than just logging.
glog.Errorln("Trouble getting size for iterator! ", err)
return nil
}
m.size = int64(size)
m.hash = ""
m.isAll = true
return &m
return &Iterator{
uid: iterator.NextUID(),
qs: qs,
dir: quad.Any,
constraint: nil,
collection: collection,
iter: qs.db.C(collection).Find(nil).Iter(),
size: int64(size),
hash: "",
isAll: true,
}
}
func (it *Iterator) UID() uint64 {
return it.uid
}
func (it *Iterator) Reset() {
it.iter.Close()
it.iter = it.ts.db.C(it.collection).Find(it.constraint).Iter()
it.iter = it.qs.db.C(it.collection).Find(it.constraint).Iter()
}
@ -98,15 +113,29 @@ func (it *Iterator) Close() {
it.iter.Close()
}
func (it *Iterator) Clone() graph.Iterator {
var newM graph.Iterator
if it.isAll {
newM = NewAllIterator(it.ts, it.collection)
} else {
newM = NewIterator(it.ts, it.collection, it.dir, it.hash)
func (it *Iterator) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Iterator) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
newM.CopyTagsFrom(it)
return newM
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *Iterator) Clone() graph.Iterator {
var m *Iterator
if it.isAll {
m = NewAllIterator(it.qs, it.collection)
} else {
m = NewIterator(it.qs, it.collection, it.dir, it.hash)
}
m.tags.CopyFrom(it)
return m
}
func (it *Iterator) Next() (graph.Value, bool) {
@ -124,30 +153,47 @@ func (it *Iterator) Next() (graph.Value, bool) {
}
return nil, false
}
it.Last = result.Id
it.result = result.Id
return result.Id, true
}
func (it *Iterator) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Iterator) Result() graph.Value {
return it.result
}
func (it *Iterator) NextResult() bool {
return false
}
// No subiterators.
func (it *Iterator) SubIterators() []graph.Iterator {
return nil
}
func (it *Iterator) Check(v graph.Value) bool {
graph.CheckLogIn(it, v)
if it.isAll {
it.Last = v
it.result = v
return graph.CheckLogOut(it, v, true)
}
var offset int
switch it.dir {
case graph.Subject:
case quad.Subject:
offset = 0
case graph.Predicate:
offset = (it.ts.hasher.Size() * 2)
case graph.Object:
offset = (it.ts.hasher.Size() * 2) * 2
case graph.Provenance:
offset = (it.ts.hasher.Size() * 2) * 3
case quad.Predicate:
offset = (it.qs.hasher.Size() * 2)
case quad.Object:
offset = (it.qs.hasher.Size() * 2) * 2
case quad.Label:
offset = (it.qs.hasher.Size() * 2) * 3
}
val := v.(string)[offset : it.ts.hasher.Size()*2+offset]
val := v.(string)[offset : it.qs.hasher.Size()*2+offset]
if val == it.hash {
it.Last = v
it.result = v
return graph.CheckLogOut(it, v, true)
}
return graph.CheckLogOut(it, v, false)
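
The offset arithmetic in Check (and in TripleDirection in the store) relies on the document ID being four hex-encoded SHA-1 hashes concatenated in subject, predicate, object, label order. A minimal sketch of that slicing, assuming a hypothetical zero-based Direction enum and helper names (hashHex, quadID, directionHash) that are not part of the commit:

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// Direction is a hypothetical stand-in for quad.Direction, numbered from 0
// so that it doubles as the field index inside the concatenated ID.
type Direction int

const (
	Subject Direction = iota
	Predicate
	Object
	Label
)

// hashHex returns the hex-encoded SHA-1 of s (40 characters).
func hashHex(s string) string {
	sum := sha1.Sum([]byte(s))
	return hex.EncodeToString(sum[:])
}

// quadID concatenates the four field hashes, in the same order that
// getIdForTriple uses in the diff.
func quadID(s, p, o, l string) string {
	return hashHex(s) + hashHex(p) + hashHex(o) + hashHex(l)
}

// directionHash slices the hash for direction d out of a quad ID.
func directionHash(id string, d Direction) string {
	width := sha1.Size * 2 // hex encoding doubles the byte length
	offset := width * int(d)
	return id[offset : offset+width]
}

func main() {
	id := quadID("A", "follows", "B", "")
	// The object slice of the ID should equal the hash of the object name.
	fmt.Println(directionHash(id, Object) == hashHex("B"))
}
```

Because every field hash has the same width, a direction lookup needs no database round trip; that is the shortcut the switch statements above take.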

View file

@ -18,6 +18,7 @@ import (
"crypto/sha1"
"encoding/hex"
"hash"
"io"
"log"
"gopkg.in/mgo.v2"
@ -26,8 +27,16 @@ import (
"github.com/barakmich/glog"
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
func init() {
graph.RegisterTripleStore("mongo", newTripleStore, createNewMongoGraph)
}
// Guarantee we satisfy graph.BulkLoader.
var _ graph.BulkLoader = (*TripleStore)(nil)
const DefaultDBName = "cayley"
type TripleStore struct {
@ -60,13 +69,13 @@ func createNewMongoGraph(addr string, options graph.Options) error {
db.C("triples").EnsureIndex(indexOpts)
indexOpts.Key = []string{"Obj"}
db.C("triples").EnsureIndex(indexOpts)
indexOpts.Key = []string{"Provenance"}
indexOpts.Key = []string{"Label"}
db.C("triples").EnsureIndex(indexOpts)
return nil
}
func newTripleStore(addr string, options graph.Options) (graph.TripleStore, error) {
var ts TripleStore
var qs TripleStore
conn, err := mgo.Dial(addr)
if err != nil {
return nil, err
@ -76,26 +85,26 @@ func newTripleStore(addr string, options graph.Options) (graph.TripleStore, erro
if val, ok := options.StringKey("database_name"); ok {
dbName = val
}
ts.db = conn.DB(dbName)
ts.session = conn
ts.hasher = sha1.New()
ts.idCache = NewIDLru(1 << 16)
return &ts, nil
qs.db = conn.DB(dbName)
qs.session = conn
qs.hasher = sha1.New()
qs.idCache = NewIDLru(1 << 16)
return &qs, nil
}
func (ts *TripleStore) getIdForTriple(t *graph.Triple) string {
id := ts.ConvertStringToByteHash(t.Subject)
id += ts.ConvertStringToByteHash(t.Predicate)
id += ts.ConvertStringToByteHash(t.Object)
id += ts.ConvertStringToByteHash(t.Provenance)
func (qs *TripleStore) getIdForTriple(t *quad.Quad) string {
id := qs.ConvertStringToByteHash(t.Subject)
id += qs.ConvertStringToByteHash(t.Predicate)
id += qs.ConvertStringToByteHash(t.Object)
id += qs.ConvertStringToByteHash(t.Label)
return id
}
func (ts *TripleStore) ConvertStringToByteHash(s string) string {
ts.hasher.Reset()
key := make([]byte, 0, ts.hasher.Size())
ts.hasher.Write([]byte(s))
key = ts.hasher.Sum(key)
func (qs *TripleStore) ConvertStringToByteHash(s string) string {
qs.hasher.Reset()
key := make([]byte, 0, qs.hasher.Size())
qs.hasher.Write([]byte(s))
key = qs.hasher.Sum(key)
return hex.EncodeToString(key)
}
@ -105,10 +114,10 @@ type MongoNode struct {
Size int "Size"
}
func (ts *TripleStore) updateNodeBy(node_name string, inc int) {
func (qs *TripleStore) updateNodeBy(node_name string, inc int) {
var size MongoNode
node := ts.ValueOf(node_name)
err := ts.db.C("nodes").FindId(node).One(&size)
node := qs.ValueOf(node_name)
err := qs.db.C("nodes").FindId(node).One(&size)
if err != nil {
if err.Error() == "not found" {
// Not found. Okay.
@ -128,7 +137,7 @@ func (ts *TripleStore) updateNodeBy(node_name string, inc int) {
// Removing something...
if inc < 0 {
if size.Size <= 0 {
err := ts.db.C("nodes").RemoveId(node)
err := qs.db.C("nodes").RemoveId(node)
if err != nil {
glog.Error("Error: ", err, " while removing node ", node_name)
return
@ -136,21 +145,21 @@ func (ts *TripleStore) updateNodeBy(node_name string, inc int) {
}
}
_, err2 := ts.db.C("nodes").UpsertId(node, size)
_, err2 := qs.db.C("nodes").UpsertId(node, size)
if err2 != nil {
glog.Error("Error: ", err2)
}
}
func (ts *TripleStore) writeTriple(t *graph.Triple) bool {
func (qs *TripleStore) writeTriple(t *quad.Quad) bool {
tripledoc := bson.M{
"_id": ts.getIdForTriple(t),
"Subject": t.Subject,
"Predicate": t.Predicate,
"Object": t.Object,
"Provenance": t.Provenance,
"_id": qs.getIdForTriple(t),
"Subject": t.Subject,
"Predicate": t.Predicate,
"Object": t.Object,
"Label": t.Label,
}
err := ts.db.C("triples").Insert(tripledoc)
err := qs.db.C("triples").Insert(tripledoc)
if err != nil {
// Among the reasons I hate MongoDB. "Errors don't happen! Right guys?"
if err.(*mgo.LastError).Code == 11000 {
@ -162,98 +171,98 @@ func (ts *TripleStore) writeTriple(t *graph.Triple) bool {
return true
}
func (ts *TripleStore) AddTriple(t *graph.Triple) {
_ = ts.writeTriple(t)
ts.updateNodeBy(t.Subject, 1)
ts.updateNodeBy(t.Predicate, 1)
ts.updateNodeBy(t.Object, 1)
if t.Provenance != "" {
ts.updateNodeBy(t.Provenance, 1)
func (qs *TripleStore) AddTriple(t *quad.Quad) {
_ = qs.writeTriple(t)
qs.updateNodeBy(t.Subject, 1)
qs.updateNodeBy(t.Predicate, 1)
qs.updateNodeBy(t.Object, 1)
if t.Label != "" {
qs.updateNodeBy(t.Label, 1)
}
}
func (ts *TripleStore) AddTripleSet(in []*graph.Triple) {
ts.session.SetSafe(nil)
func (qs *TripleStore) AddTripleSet(in []*quad.Quad) {
qs.session.SetSafe(nil)
ids := make(map[string]int)
for _, t := range in {
wrote := ts.writeTriple(t)
wrote := qs.writeTriple(t)
if wrote {
ids[t.Subject]++
ids[t.Object]++
ids[t.Predicate]++
if t.Provenance != "" {
ids[t.Provenance]++
if t.Label != "" {
ids[t.Label]++
}
}
}
for k, v := range ids {
ts.updateNodeBy(k, v)
qs.updateNodeBy(k, v)
}
ts.session.SetSafe(&mgo.Safe{})
qs.session.SetSafe(&mgo.Safe{})
}
func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
err := ts.db.C("triples").RemoveId(ts.getIdForTriple(t))
func (qs *TripleStore) RemoveTriple(t *quad.Quad) {
err := qs.db.C("triples").RemoveId(qs.getIdForTriple(t))
if err == mgo.ErrNotFound {
return
} else if err != nil {
log.Println("Error: ", err, " while removing triple ", t)
return
}
ts.updateNodeBy(t.Subject, -1)
ts.updateNodeBy(t.Predicate, -1)
ts.updateNodeBy(t.Object, -1)
if t.Provenance != "" {
ts.updateNodeBy(t.Provenance, -1)
qs.updateNodeBy(t.Subject, -1)
qs.updateNodeBy(t.Predicate, -1)
qs.updateNodeBy(t.Object, -1)
if t.Label != "" {
qs.updateNodeBy(t.Label, -1)
}
}
func (ts *TripleStore) Triple(val graph.Value) *graph.Triple {
func (qs *TripleStore) Quad(val graph.Value) *quad.Quad {
var bsonDoc bson.M
err := ts.db.C("triples").FindId(val.(string)).One(&bsonDoc)
err := qs.db.C("triples").FindId(val.(string)).One(&bsonDoc)
if err != nil {
log.Println("Error: Couldn't retrieve triple", val.(string), err)
}
return &graph.Triple{
return &quad.Quad{
bsonDoc["Subject"].(string),
bsonDoc["Predicate"].(string),
bsonDoc["Object"].(string),
bsonDoc["Provenance"].(string),
bsonDoc["Label"].(string),
}
}
func (ts *TripleStore) TripleIterator(d graph.Direction, val graph.Value) graph.Iterator {
return NewIterator(ts, "triples", d, val)
func (qs *TripleStore) TripleIterator(d quad.Direction, val graph.Value) graph.Iterator {
return NewIterator(qs, "triples", d, val)
}
func (ts *TripleStore) NodesAllIterator() graph.Iterator {
return NewAllIterator(ts, "nodes")
func (qs *TripleStore) NodesAllIterator() graph.Iterator {
return NewAllIterator(qs, "nodes")
}
func (ts *TripleStore) TriplesAllIterator() graph.Iterator {
return NewAllIterator(ts, "triples")
func (qs *TripleStore) TriplesAllIterator() graph.Iterator {
return NewAllIterator(qs, "triples")
}
func (ts *TripleStore) ValueOf(s string) graph.Value {
return ts.ConvertStringToByteHash(s)
func (qs *TripleStore) ValueOf(s string) graph.Value {
return qs.ConvertStringToByteHash(s)
}
func (ts *TripleStore) NameOf(v graph.Value) string {
val, ok := ts.idCache.Get(v.(string))
func (qs *TripleStore) NameOf(v graph.Value) string {
val, ok := qs.idCache.Get(v.(string))
if ok {
return val
}
var node MongoNode
err := ts.db.C("nodes").FindId(v.(string)).One(&node)
err := qs.db.C("nodes").FindId(v.(string)).One(&node)
if err != nil {
log.Println("Error: Couldn't retrieve node", v.(string), err)
}
ts.idCache.Put(v.(string), node.Name)
qs.idCache.Put(v.(string), node.Name)
return node.Name
}
func (ts *TripleStore) Size() int64 {
count, err := ts.db.C("triples").Count()
func (qs *TripleStore) Size() int64 {
count, err := qs.db.C("triples").Count()
if err != nil {
glog.Error("Error: ", err)
return 0
@ -265,40 +274,48 @@ func compareStrings(a, b graph.Value) bool {
return a.(string) == b.(string)
}
func (ts *TripleStore) FixedIterator() graph.FixedIterator {
func (qs *TripleStore) FixedIterator() graph.FixedIterator {
return iterator.NewFixedIteratorWithCompare(compareStrings)
}
func (ts *TripleStore) Close() {
ts.db.Session.Close()
func (qs *TripleStore) Close() {
qs.db.Session.Close()
}
func (ts *TripleStore) TripleDirection(in graph.Value, d graph.Direction) graph.Value {
func (qs *TripleStore) TripleDirection(in graph.Value, d quad.Direction) graph.Value {
// Maybe do the trick here
var offset int
switch d {
case graph.Subject:
case quad.Subject:
offset = 0
case graph.Predicate:
offset = (ts.hasher.Size() * 2)
case graph.Object:
offset = (ts.hasher.Size() * 2) * 2
case graph.Provenance:
offset = (ts.hasher.Size() * 2) * 3
case quad.Predicate:
offset = (qs.hasher.Size() * 2)
case quad.Object:
offset = (qs.hasher.Size() * 2) * 2
case quad.Label:
offset = (qs.hasher.Size() * 2) * 3
}
val := in.(string)[offset : ts.hasher.Size()*2+offset]
val := in.(string)[offset : qs.hasher.Size()*2+offset]
return val
}
func (ts *TripleStore) BulkLoad(t_chan chan *graph.Triple) bool {
if ts.Size() != 0 {
return false
func (qs *TripleStore) BulkLoad(dec quad.Unmarshaler) error {
if qs.Size() != 0 {
return graph.ErrCannotBulkLoad
}
ts.session.SetSafe(nil)
for triple := range t_chan {
ts.writeTriple(triple)
qs.session.SetSafe(nil)
for {
q, err := dec.Unmarshal()
if err != nil {
if err != io.EOF {
return err
}
break
}
qs.writeTriple(q)
}
outputTo := bson.M{"replace": "nodes", "sharded": true}
glog.Infoln("Mapreducing")
job := mgo.MapReduce{
@ -311,8 +328,8 @@ func (ts *TripleStore) BulkLoad(t_chan chan *graph.Triple) bool {
emit(s_key, {"_id": s_key, "Name" : this.Subject, "Size" : 1})
emit(p_key, {"_id": p_key, "Name" : this.Predicate, "Size" : 1})
emit(o_key, {"_id": o_key, "Name" : this.Object, "Size" : 1})
if (this.Provenance != "") {
emit(c_key, {"_id": c_key, "Name" : this.Provenance, "Size" : 1})
if (this.Label != "") {
emit(c_key, {"_id": c_key, "Name" : this.Label, "Size" : 1})
}
}
`,
@ -330,16 +347,13 @@ func (ts *TripleStore) BulkLoad(t_chan chan *graph.Triple) bool {
`,
Out: outputTo,
}
ts.db.C("triples").Find(nil).MapReduce(&job, nil)
qs.db.C("triples").Find(nil).MapReduce(&job, nil)
glog.Infoln("Fixing")
ts.db.Run(bson.D{{"eval", `function() { db.nodes.find().forEach(function (result) {
qs.db.Run(bson.D{{"eval", `function() { db.nodes.find().forEach(function (result) {
db.nodes.update({"_id": result._id}, result.value)
}) }`}, {"args", bson.D{}}}, nil)
ts.session.SetSafe(&mgo.Safe{})
return true
}
qs.session.SetSafe(&mgo.Safe{})
func init() {
graph.RegisterTripleStore("mongo", newTripleStore, createNewMongoGraph)
return nil
}
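
The new BulkLoad signature replaces the channel of triples with a decoder that is drained until io.EOF. A minimal sketch of that loop, assuming hypothetical stand-ins (Quad, Unmarshaler, sliceDecoder, store, errCannotBulkLoad) for the real quad.Quad, quad.Unmarshaler, the Mongo-backed store, and graph.ErrCannotBulkLoad:

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Quad is a hypothetical stand-in for quad.Quad.
type Quad struct {
	Subject, Predicate, Object, Label string
}

// Unmarshaler mirrors the method that BulkLoad calls on its decoder.
type Unmarshaler interface {
	Unmarshal() (*Quad, error)
}

// sliceDecoder is an illustrative decoder that yields quads from a slice
// and reports io.EOF once the slice is exhausted.
type sliceDecoder struct {
	quads []*Quad
	pos   int
}

func (d *sliceDecoder) Unmarshal() (*Quad, error) {
	if d.pos >= len(d.quads) {
		return nil, io.EOF
	}
	q := d.quads[d.pos]
	d.pos++
	return q, nil
}

// errCannotBulkLoad stands in for graph.ErrCannotBulkLoad.
var errCannotBulkLoad = errors.New("cannot bulk load")

// store is an in-memory stand-in for the Mongo-backed store.
type store struct{ quads []*Quad }

// BulkLoad refuses to load into a non-empty store, then reads quads until
// the decoder reports io.EOF. Any other error aborts the load.
func (s *store) BulkLoad(dec Unmarshaler) error {
	if len(s.quads) != 0 {
		return errCannotBulkLoad
	}
	for {
		q, err := dec.Unmarshal()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		s.quads = append(s.quads, q)
	}
}

func main() {
	s := &store{}
	dec := &sliceDecoder{quads: []*Quad{
		{"A", "follows", "B", ""},
		{"C", "follows", "B", ""},
	}}
	fmt.Println(s.BulkLoad(dec), len(s.quads))
	// A second load into the now non-empty store is rejected.
	fmt.Println(s.BulkLoad(dec) == errCannotBulkLoad)
}
```

Returning an error rather than a bool lets the caller distinguish a refused load from a decoder failure, which the old channel-based signature could not express.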

View file

@ -37,14 +37,15 @@ func (ts *TripleStore) optimizeLinksTo(it *iterator.LinksTo) (graph.Iterator, bo
if primary.Type() == graph.Fixed {
size, _ := primary.Size()
if size == 1 {
val, ok := primary.Next()
val, ok := graph.Next(primary)
if !ok {
panic("Sizes lie")
}
newIt := ts.TripleIterator(it.Direction(), val)
newIt.CopyTagsFrom(it)
for _, tag := range primary.Tags() {
newIt.AddFixedTag(tag, val)
nt := newIt.Tagger()
nt.CopyFrom(it)
for _, tag := range primary.Tagger().Tags() {
nt.AddFixed(tag, val)
}
it.Close()
return newIt, true

View file

@ -40,7 +40,7 @@ func (t *ResultTree) AddSubtree(sub *ResultTree) {
t.subtrees = append(t.subtrees, sub)
}
func StringResultTreeEvaluator(it Iterator) string {
func StringResultTreeEvaluator(it Nexter) string {
ok := true
out := ""
for {
@ -59,6 +59,6 @@ func StringResultTreeEvaluator(it Iterator) string {
return out
}
func PrintResultTreeEvaluator(it Iterator) {
func PrintResultTreeEvaluator(it Nexter) {
fmt.Print(StringResultTreeEvaluator(it))
}

View file

@ -19,6 +19,7 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
func BuildIteratorTreeForQuery(ts graph.TripleStore, query string) graph.Iterator {
@ -189,7 +190,7 @@ func buildIteratorTree(tree *peg.ExpressionTree, ts graph.TripleStore) graph.Ite
nodeID := getIdentString(tree)
if tree.Children[0].Name == "Variable" {
allIt := ts.NodesAllIterator()
allIt.AddTag(nodeID)
allIt.Tagger().Add(nodeID)
out = allIt
} else {
n := nodeID
@ -208,7 +209,7 @@ func buildIteratorTree(tree *peg.ExpressionTree, ts graph.TripleStore) graph.Ite
i++
}
it := buildIteratorTree(tree.Children[i], ts)
lto := iterator.NewLinksTo(ts, it, graph.Predicate)
lto := iterator.NewLinksTo(ts, it, quad.Predicate)
return lto
case "RootConstraint":
constraintCount := 0
@ -229,16 +230,16 @@ func buildIteratorTree(tree *peg.ExpressionTree, ts graph.TripleStore) graph.Ite
return and
case "Constraint":
var hasa *iterator.HasA
topLevelDir := graph.Subject
subItDir := graph.Object
topLevelDir := quad.Subject
subItDir := quad.Object
subAnd := iterator.NewAnd()
isOptional := false
for _, c := range tree.Children {
switch c.Name {
case "PredIdentifier":
if c.Children[0].Name == "Reverse" {
topLevelDir = graph.Object
subItDir = graph.Subject
topLevelDir = quad.Object
subItDir = quad.Subject
}
it := buildIteratorTree(c, ts)
subAnd.AddSubIterator(it)

View file

@ -18,6 +18,8 @@ import (
"testing"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
_ "github.com/google/cayley/graph/memstore"
)
@ -30,21 +32,21 @@ func TestBadParse(t *testing.T) {
var testQueries = []struct {
message string
add *graph.Triple
add *quad.Quad
query string
typ graph.Type
expect string
}{
{
message: "get a single triple linkage",
add: &graph.Triple{"i", "can", "win", ""},
add: &quad.Quad{"i", "can", "win", ""},
query: "($a (:can \"win\"))",
typ: graph.And,
expect: "i",
},
{
message: "get a single triple linkage",
add: &graph.Triple{"i", "can", "win", ""},
add: &quad.Quad{"i", "can", "win", ""},
query: "(\"i\" (:can $a))",
typ: graph.And,
expect: "i",
@ -65,7 +67,7 @@ func TestMemstoreBackedSexp(t *testing.T) {
if it.Type() != test.typ {
t.Errorf("Incorrect type for %s, got:%q expect %q", test.message, it.Type(), test.typ)
}
got, ok := it.Next()
got, ok := graph.Next(it)
if !ok {
t.Errorf("Failed to %s", test.message)
}
@ -77,8 +79,8 @@ func TestMemstoreBackedSexp(t *testing.T) {
func TestTreeConstraintParse(t *testing.T) {
ts, _ := graph.NewTripleStore("memstore", "", nil)
ts.AddTriple(&graph.Triple{"i", "like", "food", ""})
ts.AddTriple(&graph.Triple{"food", "is", "good", ""})
ts.AddTriple(&quad.Quad{"i", "like", "food", ""})
ts.AddTriple(&quad.Quad{"food", "is", "good", ""})
query := "(\"i\"\n" +
"(:like\n" +
"($a (:is :good))))"
@ -86,7 +88,7 @@ func TestTreeConstraintParse(t *testing.T) {
if it.Type() != graph.And {
t.Errorf("Odd iterator tree. Got: %s", it.DebugString(0))
}
out, ok := it.Next()
out, ok := graph.Next(it)
if !ok {
t.Error("Got no results")
}
@ -97,13 +99,13 @@ func TestTreeConstraintParse(t *testing.T) {
func TestTreeConstraintTagParse(t *testing.T) {
ts, _ := graph.NewTripleStore("memstore", "", nil)
ts.AddTriple(&graph.Triple{"i", "like", "food", ""})
ts.AddTriple(&graph.Triple{"food", "is", "good", ""})
ts.AddTriple(&quad.Quad{"i", "like", "food", ""})
ts.AddTriple(&quad.Quad{"food", "is", "good", ""})
query := "(\"i\"\n" +
"(:like\n" +
"($a (:is :good))))"
it := BuildIteratorTreeForQuery(ts, query)
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
t.Error("Got no results")
}
@ -117,7 +119,7 @@ func TestTreeConstraintTagParse(t *testing.T) {
func TestMultipleConstraintParse(t *testing.T) {
ts, _ := graph.NewTripleStore("memstore", "", nil)
for _, tv := range []*graph.Triple{
for _, tv := range []*quad.Quad{
{"i", "like", "food", ""},
{"i", "like", "beer", ""},
{"you", "like", "beer", ""},
@ -133,14 +135,14 @@ func TestMultipleConstraintParse(t *testing.T) {
if it.Type() != graph.And {
t.Errorf("Odd iterator tree. Got: %s", it.DebugString(0))
}
out, ok := it.Next()
out, ok := graph.Next(it)
if !ok {
t.Error("Got no results")
}
if out != ts.ValueOf("i") {
t.Errorf("Got %d, expected %d", out, ts.ValueOf("i"))
}
_, ok = it.Next()
_, ok = graph.Next(it)
if ok {
t.Error("Too many results")
}

View file

@ -77,7 +77,7 @@ func (s *Session) ExecInput(input string, out chan interface{}, limit int) {
}
nResults := 0
for {
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
break
}

View file

@ -23,7 +23,9 @@ package graph
import (
"errors"
"github.com/barakmich/glog"
"github.com/google/cayley/quad"
)
// Defines an opaque "triple store value" type. However the backend wishes to
@ -38,21 +40,21 @@ type Value interface{}
type TripleStore interface {
// Add a triple to the store.
AddTriple(*Triple)
AddTriple(*quad.Quad)
// Add a set of triples to the store, atomically if possible.
AddTripleSet([]*Triple)
AddTripleSet([]*quad.Quad)
// Removes a triple matching the given one from the database,
// if it exists. Does nothing otherwise.
RemoveTriple(*Triple)
RemoveTriple(*quad.Quad)
// Given an opaque token, returns the triple for that token from the store.
Triple(Value) *Triple
Quad(Value) *quad.Quad
// Given a direction and a token, creates an iterator of links which have
// that node token in that directional field.
TripleIterator(Direction, Value) Iterator
TripleIterator(quad.Direction, Value) Iterator
// Returns an iterator enumerating all nodes in the graph.
NodesAllIterator() Iterator
@ -89,8 +91,8 @@ type TripleStore interface {
// gives the TripleStore the opportunity to make this optimization.
//
// Iterators will call this. At worst, a valid implementation is
// ts.IdFor(ts.Triple(triple_id).Get(dir))
TripleDirection(triple_id Value, d Direction) Value
// ts.IdFor(ts.Quad(id).Get(dir))
TripleDirection(id Value, d quad.Direction) Value
}
type Options map[string]interface{}
@ -122,14 +124,10 @@ func (d Options) StringKey(key string) (string, bool) {
var ErrCannotBulkLoad = errors.New("triplestore: cannot bulk load")
type BulkLoader interface {
// BulkLoad loads Triples from a TripleUnmarshaler in bulk to the TripleStore.
// BulkLoad loads Quads from a quad.Unmarshaler in bulk to the TripleStore.
// It returns ErrCannotBulkLoad if bulk loading is not possible, for example
// when the store can only bulk load into an empty database and the database is non-empty.
BulkLoad(TripleUnmarshaler) error
}
type TripleUnmarshaler interface {
Unmarshal() (*Triple, error)
BulkLoad(quad.Unmarshaler) error
}
type NewStoreFunc func(string, Options) (TripleStore, error)

View file

@ -19,22 +19,22 @@ import (
"reflect"
"testing"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
var parseTests = []struct {
message string
input string
expect []*graph.Triple
expect []*quad.Quad
err error
}{
{
message: "parse correct JSON",
input: `[
{"subject": "foo", "predicate": "bar", "object": "baz"},
{"subject": "foo", "predicate": "bar", "object": "baz", "provenance": "graph"}
{"subject": "foo", "predicate": "bar", "object": "baz", "label": "graph"}
]`,
expect: []*graph.Triple{
expect: []*quad.Quad{
{"foo", "bar", "baz", ""},
{"foo", "bar", "baz", "graph"},
},
@ -45,7 +45,7 @@ var parseTests = []struct {
input: `[
{"subject": "foo", "predicate": "bar", "object": "foo", "something_else": "extra data"}
]`,
expect: []*graph.Triple{
expect: []*quad.Quad{
{"foo", "bar", "foo", ""},
},
err: nil,
@ -56,7 +56,7 @@ var parseTests = []struct {
{"subject": "foo", "predicate": "bar"}
]`,
expect: nil,
err: fmt.Errorf("Invalid triple at index %d. %v", 0, &graph.Triple{"foo", "bar", "", ""}),
err: fmt.Errorf("Invalid triple at index %d. %v", 0, &quad.Quad{"foo", "bar", "", ""}),
},
}

View file

@ -25,12 +25,12 @@ import (
"github.com/barakmich/glog"
"github.com/julienschmidt/httprouter"
"github.com/google/cayley/graph"
"github.com/google/cayley/nquads"
"github.com/google/cayley/quad"
"github.com/google/cayley/quad/nquads"
)
func ParseJsonToTripleList(jsonBody []byte) ([]*graph.Triple, error) {
var tripleList []*graph.Triple
func ParseJsonToTripleList(jsonBody []byte) ([]*quad.Quad, error) {
var tripleList []*quad.Quad
err := json.Unmarshal(jsonBody, &tripleList)
if err != nil {
return nil, err
@ -83,7 +83,7 @@ func (api *Api) ServeV1WriteNQuad(w http.ResponseWriter, r *http.Request, params
var (
n int
block = make([]*graph.Triple, 0, blockSize)
block = make([]*quad.Quad, 0, blockSize)
)
for {
t, err := dec.Unmarshal()

View file

@ -1,138 +0,0 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package nquads
import (
"reflect"
"testing"
"github.com/google/cayley/graph"
)
var testNTriples = []struct {
message string
input string
expect *graph.Triple
err error
}{
// NTriple tests.
{
message: "not parse invalid triples",
input: "invalid",
expect: nil,
err: ErrAbsentPredicate,
},
{
message: "invalid internal quote",
input: `":103032" "/film/performance/character" "Walter "Teacher" Cole" .`,
expect: nil,
err: ErrUnterminated,
},
{
message: "not parse comments",
input: "# nominally valid triple .",
expect: nil,
err: nil,
},
{
message: "parse simple triples",
input: "this is valid .",
expect: &graph.Triple{"this", "is", "valid", ""},
},
{
message: "parse quoted triples",
input: `this is "valid too" .`,
expect: &graph.Triple{"this", "is", "valid too", ""},
},
{
message: "parse escaped quoted triples",
input: `he said "\"That's all folks\"" .`,
expect: &graph.Triple{"he", "said", `"That's all folks"`, ""},
},
{
message: "parse an example real triple",
input: `":/guid/9202a8c04000641f80000000010c843c" "name" "George Morris" .`,
expect: &graph.Triple{":/guid/9202a8c04000641f80000000010c843c", "name", "George Morris", ""},
},
{
message: "parse a pathologically spaced triple",
input: "foo is \"\\tA big tough\\r\\nDeal\\\\\" .",
expect: &graph.Triple{"foo", "is", "\tA big tough\r\nDeal\\", ""},
},
// NQuad tests.
{
message: "parse a simple quad",
input: "this is valid quad .",
expect: &graph.Triple{"this", "is", "valid", "quad"},
},
{
message: "parse a quoted quad",
input: `this is valid "quad thing" .`,
expect: &graph.Triple{"this", "is", "valid", "quad thing"},
},
{
message: "parse crazy escaped quads",
input: `"\"this" "\"is" "\"valid" "\"quad thing".`,
expect: &graph.Triple{`"this`, `"is`, `"valid`, `"quad thing`},
},
// NTriple official tests.
{
message: "handle simple case with comments",
input: "<http://example/s> <http://example/p> <http://example/o> . # comment",
expect: &graph.Triple{"http://example/s", "http://example/p", "http://example/o", ""},
},
{
message: "handle simple case with comments",
input: "<http://example/s> <http://example/p> _:o . # comment",
expect: &graph.Triple{"http://example/s", "http://example/p", "_:o", ""},
},
{
message: "handle simple case with comments",
input: "<http://example/s> <http://example/p> \"o\" . # comment",
expect: &graph.Triple{"http://example/s", "http://example/p", "o", ""},
},
{
message: "handle simple case with comments",
input: "<http://example/s> <http://example/p> \"o\"^^<http://example/dt> . # comment",
expect: &graph.Triple{"http://example/s", "http://example/p", "o", ""},
},
{
message: "handle simple case with comments",
input: "<http://example/s> <http://example/p> \"o\"@en . # comment",
expect: &graph.Triple{"http://example/s", "http://example/p", "o", ""},
},
}
func TestParse(t *testing.T) {
for _, test := range testNTriples {
got, err := Parse(test.input)
if err != test.err {
t.Errorf("Unexpected error when %s: got:%v expect:%v", test.message, err, test.err)
}
if !reflect.DeepEqual(got, test.expect) {
t.Errorf("Failed to %s, %q, got:%q expect:%q", test.message, test.input, got, test.expect)
}
}
}
var result *graph.Triple
func BenchmarkParser(b *testing.B) {
for n := 0; n < b.N; n++ {
result, _ = Parse("<http://example/s> <http://example/p> \"object of some real\\tlength\"@en . # comment")
}
}

95
quad/cquads/actions.rl Normal file

@ -0,0 +1,95 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
%%{
machine quads;
action Escape {
isEscaped = true
}
action Quote {
isQuoted = true
}
action StartSubject {
subject = p
}
action StartPredicate {
predicate = p
}
action StartObject {
object = p
}
action StartLabel {
label = p
}
action SetSubject {
if subject < 0 {
panic("unexpected parser state: subject start not set")
}
q.Subject = unEscape(data[subject:p], isQuoted, isEscaped)
isEscaped = false
isQuoted = false
}
action SetPredicate {
if predicate < 0 {
panic("unexpected parser state: predicate start not set")
}
q.Predicate = unEscape(data[predicate:p], isQuoted, isEscaped)
isEscaped = false
isQuoted = false
}
action SetObject {
if object < 0 {
panic("unexpected parser state: object start not set")
}
q.Object = unEscape(data[object:p], isQuoted, isEscaped)
isEscaped = false
isQuoted = false
}
action SetLabel {
if label < 0 {
panic("unexpected parser state: label start not set")
}
q.Label = unEscape(data[label:p], isQuoted, isEscaped)
isEscaped = false
isQuoted = false
}
action Return {
return q, nil
}
action Comment {
}
action Error {
if p < len(data) {
if r := data[p]; r < unicode.MaxASCII {
return q, fmt.Errorf("%v: unexpected rune %q at %d", quad.ErrInvalid, data[p], p)
} else {
return q, fmt.Errorf("%v: unexpected rune %q (\\u%04x) at %d", quad.ErrInvalid, data[p], data[p], p)
}
}
return q, quad.ErrIncomplete
}
}%%

144
quad/cquads/cquads.go Normal file

@ -0,0 +1,144 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// Package cquads implements parsing of an N-Quads-like line-based syntax
// for RDF datasets.
//
// N-Quad parsing is based on a simplified grammar derived from
// the N-Quads grammar defined by http://www.w3.org/TR/n-quads/.
//
// For a complete definition of the grammar, see cquads.rl.
package cquads
import (
"bufio"
"bytes"
"fmt"
"io"
"strconv"
"github.com/google/cayley/quad"
)
// Parse returns a valid quad.Quad or a non-nil error. Parse handles
// comments only where their placement does not prevent a complete
// valid quad.Quad from being defined.
func Parse(str string) (*quad.Quad, error) {
q, err := parse([]rune(str))
return &q, err
}
// Decoder implements simplified N-Quad document parsing.
type Decoder struct {
r *bufio.Reader
line []byte
}
// NewDecoder returns an N-Quad decoder that takes its input from the
// provided io.Reader.
func NewDecoder(r io.Reader) *Decoder {
return &Decoder{r: bufio.NewReader(r)}
}
// Unmarshal returns the next valid N-Quad as a quad.Quad, or an error.
func (dec *Decoder) Unmarshal() (*quad.Quad, error) {
dec.line = dec.line[:0]
var line []byte
for {
for {
l, pre, err := dec.r.ReadLine()
if err != nil {
return nil, err
}
dec.line = append(dec.line, l...)
if !pre {
break
}
}
if line = bytes.TrimSpace(dec.line); len(line) != 0 && line[0] != '#' {
break
}
dec.line = dec.line[:0]
}
triple, err := Parse(string(line))
if err != nil {
return nil, fmt.Errorf("failed to parse %q: %v", dec.line, err)
}
if triple == nil {
return dec.Unmarshal()
}
return triple, nil
}
func unEscape(r []rune, isQuoted, isEscaped bool) string {
if isQuoted {
r = r[1 : len(r)-1]
}
if len(r) >= 2 && r[0] == '<' && r[len(r)-1] == '>' {
return string(r[1 : len(r)-1])
}
if !isEscaped {
return string(r)
}
buf := bytes.NewBuffer(make([]byte, 0, len(r)))
for i := 0; i < len(r); {
switch r[i] {
case '\\':
i++
var c byte
switch r[i] {
case 't':
c = '\t'
case 'b':
c = '\b'
case 'n':
c = '\n'
case 'r':
c = '\r'
case 'f':
c = '\f'
case '"':
c = '"'
case '\'':
c = '\''
case '\\':
c = '\\'
case 'u':
rc, err := strconv.ParseInt(string(r[i+1:i+5]), 16, 32)
if err != nil {
panic(fmt.Errorf("internal parser error: %v", err))
}
buf.WriteRune(rune(rc))
i += 5
continue
case 'U':
rc, err := strconv.ParseInt(string(r[i+1:i+9]), 16, 32)
if err != nil {
panic(fmt.Errorf("internal parser error: %v", err))
}
buf.WriteRune(rune(rc))
i += 9
continue
}
buf.WriteByte(c)
default:
buf.WriteRune(r[i])
}
i++
}
return buf.String()
}

106
quad/cquads/cquads.rl Normal file

@ -0,0 +1,106 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// Ragel grammar definition derived from http://www.w3.org/TR/n-quads/#sec-grammar.
%%{
machine quads;
alphtype rune;
PN_CHARS_BASE = [A-Za-z]
| 0x00c0 .. 0x00d6
| 0x00d8 .. 0x00f6
| 0x00f8 .. 0x02ff
| 0x0370 .. 0x037d
| 0x037f .. 0x1fff
| 0x200c .. 0x200d
| 0x2070 .. 0x218f
| 0x2c00 .. 0x2fef
| 0x3001 .. 0xd7ff
| 0xf900 .. 0xfdcf
| 0xfdf0 .. 0xfffd
| 0x10000 .. 0xeffff
;
PN_CHARS_U = PN_CHARS_BASE | '_' | ':' ;
PN_CHARS = PN_CHARS_U
| '-'
| [0-9]
| 0xb7
| 0x0300 .. 0x036f
| 0x203f .. 0x2040
;
ECHAR = ('\\' [tbnrf"'\\]) %Escape ;
UCHAR = ('\\u' xdigit {4}
| '\\U' xdigit {8}) %Escape
;
BLANK_NODE_LABEL = '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)? ;
STRING_LITERAL = (
'!'
| '#' .. '['
| ']' .. 0x7e
| 0x80 .. 0x10ffff
| ECHAR
| UCHAR)+ - ('_:' | any* '.' | '#' any*)
;
STRING_LITERAL_QUOTE = '"' (
0x00 .. 0x09
| 0x0b .. 0x0c
| 0x0e .. '!'
| '#' .. '['
| ']' .. 0x10ffff
| ECHAR
| UCHAR)*
'"'
;
IRIREF = '<' (
'!' .. ';'
| '='
| '?' .. '['
| ']'
| '_'
| 'a' .. 'z'
| '~'
| 0x80 .. 0x10ffff
| UCHAR)*
'>'
;
LANGTAG = '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ;
whitespace = [ \t] ;
literal = STRING_LITERAL | STRING_LITERAL_QUOTE % Quote | STRING_LITERAL_QUOTE ('^^' IRIREF | LANGTAG) ;
subject = (literal | BLANK_NODE_LABEL) ;
predicate = literal ;
object = (literal | BLANK_NODE_LABEL) ;
graphLabel = (literal | BLANK_NODE_LABEL) ;
statement := (
whitespace* subject >StartSubject %SetSubject
whitespace+ predicate >StartPredicate %SetPredicate
whitespace+ object >StartObject %SetObject
(whitespace+ graphLabel >StartLabel %SetLabel)?
whitespace* '.' whitespace* ('#' any*)? >Comment
) %Return @!Error ;
}%%

782
quad/cquads/cquads_test.go Normal file

File diff suppressed because it is too large

6692
quad/cquads/parse.go Normal file

File diff suppressed because it is too large

58
quad/cquads/parse.rl Normal file

@ -0,0 +1,58 @@
// GO SOURCE FILE MACHINE GENERATED BY RAGEL; DO NOT EDIT
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package cquads
import (
"fmt"
"unicode"
"github.com/google/cayley/quad"
)
%%{
machine quads;
include "actions.rl";
include "cquads.rl";
write data;
}%%
func parse(data []rune) (quad.Quad, error) {
var (
cs, p int
pe = len(data)
eof = pe
subject = -1
predicate = -1
object = -1
label = -1
isEscaped bool
isQuoted bool
q quad.Quad
)
%%write init;
%%write exec;
return quad.Quad{}, quad.ErrInvalid
}

BIN
quad/nquad_tests.tar.gz Normal file

Binary file not shown.

87
quad/nquads/actions.rl Normal file

@ -0,0 +1,87 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
%%{
machine quads;
action Escape {
isEscaped = true
}
action StartSubject {
subject = p
}
action StartPredicate {
predicate = p
}
action StartObject {
object = p
}
action StartLabel {
label = p
}
action SetSubject {
if subject < 0 {
panic("unexpected parser state: subject start not set")
}
q.Subject = unEscape(data[subject:p], isEscaped)
isEscaped = false
}
action SetPredicate {
if predicate < 0 {
panic("unexpected parser state: predicate start not set")
}
q.Predicate = unEscape(data[predicate:p], isEscaped)
isEscaped = false
}
action SetObject {
if object < 0 {
panic("unexpected parser state: object start not set")
}
q.Object = unEscape(data[object:p], isEscaped)
isEscaped = false
}
action SetLabel {
if label < 0 {
panic("unexpected parser state: label start not set")
}
q.Label = unEscape(data[label:p], isEscaped)
isEscaped = false
}
action Return {
return q, nil
}
action Comment {
}
action Error {
if p < len(data) {
if r := data[p]; r < unicode.MaxASCII {
return q, fmt.Errorf("%v: unexpected rune %q at %d", quad.ErrInvalid, data[p], p)
} else {
return q, fmt.Errorf("%v: unexpected rune %q (\\u%04x) at %d", quad.ErrInvalid, data[p], data[p], p)
}
}
return q, quad.ErrIncomplete
}
}%%

138
quad/nquads/nquads.go Normal file

@ -0,0 +1,138 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// Package nquads implements parsing the RDF 1.1 N-Quads line-based syntax
// for RDF datasets.
//
// N-Quad parsing is performed as defined by http://www.w3.org/TR/n-quads/
// with the exception that the nquads package will allow relative IRI values,
// which are prohibited by the N-Quads specification.
package nquads
import (
"bufio"
"bytes"
"fmt"
"io"
"strconv"
"github.com/google/cayley/quad"
)
// Parse returns a valid quad.Quad or a non-nil error. Parse handles
// comments only where their placement does not prevent a complete
// valid quad.Quad from being defined.
func Parse(str string) (*quad.Quad, error) {
q, err := parse([]rune(str))
return &q, err
}
// Decoder implements N-Quad document parsing according to the RDF
// 1.1 N-Quads specification.
type Decoder struct {
r *bufio.Reader
line []byte
}
// NewDecoder returns an N-Quad decoder that takes its input from the
// provided io.Reader.
func NewDecoder(r io.Reader) *Decoder {
return &Decoder{r: bufio.NewReader(r)}
}
// Unmarshal returns the next valid N-Quad as a quad.Quad, or an error.
func (dec *Decoder) Unmarshal() (*quad.Quad, error) {
dec.line = dec.line[:0]
var line []byte
for {
for {
l, pre, err := dec.r.ReadLine()
if err != nil {
return nil, err
}
dec.line = append(dec.line, l...)
if !pre {
break
}
}
if line = bytes.TrimSpace(dec.line); len(line) != 0 && line[0] != '#' {
break
}
dec.line = dec.line[:0]
}
triple, err := Parse(string(line))
if err != nil {
return nil, fmt.Errorf("failed to parse %q: %v", dec.line, err)
}
if triple == nil {
return dec.Unmarshal()
}
return triple, nil
}
func unEscape(r []rune, isEscaped bool) string {
if !isEscaped {
return string(r)
}
buf := bytes.NewBuffer(make([]byte, 0, len(r)))
for i := 0; i < len(r); {
switch r[i] {
case '\\':
i++
var c byte
switch r[i] {
case 't':
c = '\t'
case 'b':
c = '\b'
case 'n':
c = '\n'
case 'r':
c = '\r'
case 'f':
c = '\f'
case '"':
c = '"'
case '\'':
c = '\''
case '\\':
c = '\\'
case 'u':
rc, err := strconv.ParseInt(string(r[i+1:i+5]), 16, 32)
if err != nil {
panic(fmt.Errorf("internal parser error: %v", err))
}
buf.WriteRune(rune(rc))
i += 5
continue
case 'U':
rc, err := strconv.ParseInt(string(r[i+1:i+9]), 16, 32)
if err != nil {
panic(fmt.Errorf("internal parser error: %v", err))
}
buf.WriteRune(rune(rc))
i += 9
continue
}
buf.WriteByte(c)
default:
buf.WriteRune(r[i])
}
i++
}
return buf.String()
}
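
The unEscape function above relies on the Ragel grammar having already validated every escape sequence, which is why it indexes past the backslash without bounds checks. As an illustration of the same decoding rules outside that context, here is a minimal standalone sketch that reports malformed input as an error instead. The function name unescape and its error messages are assumptions made for this example; only the escape table (ECHAR and UCHAR) is taken from the grammar in the diff.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unescape (hypothetical helper) decodes the ECHAR escapes
// \t \b \n \r \f \" \' \\ and the UCHAR escapes \uXXXX and \UXXXXXXXX.
// Unlike the package code, it validates its input and returns an error
// rather than assuming a grammar has already checked it.
func unescape(s string) (string, error) {
	r := []rune(s)
	var b strings.Builder
	for i := 0; i < len(r); i++ {
		if r[i] != '\\' {
			b.WriteRune(r[i])
			continue
		}
		i++ // step onto the character that follows the backslash
		if i >= len(r) {
			return "", fmt.Errorf("dangling backslash at end of input")
		}
		switch r[i] {
		case 't':
			b.WriteByte('\t')
		case 'b':
			b.WriteByte('\b')
		case 'n':
			b.WriteByte('\n')
		case 'r':
			b.WriteByte('\r')
		case 'f':
			b.WriteByte('\f')
		case '"', '\'', '\\':
			b.WriteRune(r[i])
		case 'u', 'U':
			n := 4 // \u takes four hex digits, \U takes eight
			if r[i] == 'U' {
				n = 8
			}
			if i+1+n > len(r) {
				return "", fmt.Errorf("truncated \\%c escape", r[i])
			}
			v, err := strconv.ParseUint(string(r[i+1:i+1+n]), 16, 32)
			if err != nil {
				return "", err
			}
			b.WriteRune(rune(v))
			i += n // leave i on the last hex digit; the loop advances past it
		default:
			return "", fmt.Errorf("unknown escape \\%c", r[i])
		}
	}
	return b.String(), nil
}

func main() {
	for _, in := range []string{`\tA big tough\r\nDeal\\`, `caf\u00e9`, `bad \q escape`} {
		out, err := unescape(in)
		if err != nil {
			fmt.Printf("%s: error: %v\n", in, err)
			continue
		}
		fmt.Printf("%s -> %q\n", in, out)
	}
}
```

Whether to panic or return an error here is a design choice: inside a generated parser a malformed escape indicates an internal bug, while a general-purpose helper is more likely to see untrusted input.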

97
quad/nquads/nquads.rl Normal file

@ -0,0 +1,97 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// Ragel grammar definition derived from http://www.w3.org/TR/n-quads/#sec-grammar.
%%{
machine quads;
alphtype rune;
PN_CHARS_BASE = [A-Za-z]
| 0x00c0 .. 0x00d6
| 0x00d8 .. 0x00f6
| 0x00f8 .. 0x02ff
| 0x0370 .. 0x037d
| 0x037f .. 0x1fff
| 0x200c .. 0x200d
| 0x2070 .. 0x218f
| 0x2c00 .. 0x2fef
| 0x3001 .. 0xd7ff
| 0xf900 .. 0xfdcf
| 0xfdf0 .. 0xfffd
| 0x10000 .. 0xeffff
;
PN_CHARS_U = PN_CHARS_BASE | '_' | ':' ;
PN_CHARS = PN_CHARS_U
| '-'
| [0-9]
| 0xb7
| 0x0300 .. 0x036f
| 0x203f .. 0x2040
;
ECHAR = ('\\' [tbnrf"'\\]) %Escape ;
UCHAR = ('\\u' xdigit {4}
| '\\U' xdigit {8}) %Escape
;
BLANK_NODE_LABEL = '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)? ;
STRING_LITERAL_QUOTE = '"' (
0x00 .. 0x09
| 0x0b .. 0x0c
| 0x0e .. '!'
| '#' .. '['
| ']' .. 0x10ffff
| ECHAR
| UCHAR)*
'"'
;
IRIREF = '<' (
'!' .. ';'
| '='
| '?' .. '['
| ']'
| '_'
| 'a' .. 'z'
| '~'
| 0x80 .. 0x10ffff
| UCHAR)*
'>'
;
LANGTAG = '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ;
whitespace = [ \t] ;
literal = STRING_LITERAL_QUOTE ('^^' IRIREF | LANGTAG)? ;
subject = IRIREF | BLANK_NODE_LABEL ;
predicate = IRIREF ;
object = IRIREF | BLANK_NODE_LABEL | literal ;
graphLabel = IRIREF | BLANK_NODE_LABEL ;
statement := (
whitespace* subject >StartSubject %SetSubject
whitespace* predicate >StartPredicate %SetPredicate
whitespace* object >StartObject %SetObject
(whitespace* graphLabel >StartLabel %SetLabel)?
whitespace* '.' whitespace* ('#' any*)? >Comment
) %Return @!Error ;
}%%

589
quad/nquads/nquads_test.go Normal file

File diff suppressed because it is too large

3652
quad/nquads/parse.go Normal file

File diff suppressed because it is too large

57
quad/nquads/parse.rl Normal file

@ -0,0 +1,57 @@
// GO SOURCE FILE MACHINE GENERATED BY RAGEL; DO NOT EDIT
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package nquads
import (
"fmt"
"unicode"
"github.com/google/cayley/quad"
)
%%{
machine quads;
include "actions.rl";
include "nquads.rl";
write data;
}%%
func parse(data []rune) (quad.Quad, error) {
var (
cs, p int
pe = len(data)
eof = pe
subject = -1
predicate = -1
object = -1
label = -1
isEscaped bool
q quad.Quad
)
%%write init;
%%write exec;
return quad.Quad{}, quad.ErrInvalid
}

View file

@ -1,29 +1,52 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// +build ignore
package nquads
package main
import (
"bufio"
"errors"
"fmt"
"io"
"log"
"os"
"strings"
"github.com/google/cayley/graph"
)
func main() {
dec := NewDecoder(os.Stdin)
for {
t, err := dec.Unmarshal()
if err != nil {
if err == io.EOF {
return
}
log.Println(err)
// Unmarshal returns a nil quad on error, so skip to the next line.
continue
}
if t.Subject[0] == ':' && t.Subject[1] == '/' {
t.Subject = "<" + t.Subject[1:] + ">"
} else {
t.Subject = "_" + t.Subject
}
if t.Object[0] == ':' {
if t.Object[1] == '/' {
t.Object = "<" + t.Object[1:] + ">"
} else {
t.Object = "_" + t.Object
}
} else if t.Object[0] == '/' {
t.Object = "<" + t.Object + ">"
} else {
t.Object = fmt.Sprintf(`%q`, t.Object)
}
fmt.Printf("%s <%s> %s .\n", t.Subject, t.Predicate, t.Object)
}
}
// Historical N-Quads parser code.
// -------------------------------
var (
ErrAbsentSubject = errors.New("nquads: absent subject")
ErrAbsentPredicate = errors.New("nquads: absent predicate")

BIN
quad/ntriple_tests.tar.gz Normal file

Binary file not shown.

View file

@ -12,7 +12,8 @@
// See the License for the specific language governing permissions and
// limitations under the License.
package graph
// Package quad defines quad and triple handling.
package quad
// Defines the struct which makes the TripleStore possible -- the triple.
//
@ -25,7 +26,7 @@ package graph
// list of triples. The rest is just indexing for speed.
//
// Adding fields to the triple is not to be taken lightly. You'll see I mention
// provenance, but don't as yet use it in any backing store. In general, there
// label, but don't as yet use it in any backing store. In general, there
// can be features that can be turned on or off for any store, but I haven't
// decided how to allow/disallow them yet. Another such example would be to add
// a forward and reverse index field -- forward being "order the list of
@ -35,17 +36,22 @@ package graph
// There will never be that much in this file except for the definition, but
// the consequences are not to be taken lightly. But do suggest cool features!
import "fmt"
import (
"errors"
"fmt"
)
// TODO(kortschak) Consider providing MarshalJSON and UnmarshalJSON
// instead of using struct tags.
var (
ErrInvalid = errors.New("invalid N-Quad")
ErrIncomplete = errors.New("incomplete N-Quad")
)
// Our triple struct, used throughout.
type Triple struct {
Subject string `json:"subject"`
Predicate string `json:"predicate"`
Object string `json:"object"`
Provenance string `json:"provenance,omitempty"`
type Quad struct {
Subject string `json:"subject"`
Predicate string `json:"predicate"`
Object string `json:"object"`
Label string `json:"label,omitempty"`
}
// Direction specifies an edge's type.
@ -57,7 +63,7 @@ const (
Subject
Predicate
Object
Provenance
Label
)
func (d Direction) Prefix() byte {
@ -68,7 +74,7 @@ func (d Direction) Prefix() byte {
return 's'
case Predicate:
return 'p'
case Provenance:
case Label:
return 'c'
case Object:
return 'o'
@ -85,8 +91,8 @@ func (d Direction) String() string {
return "subject"
case Predicate:
return "predicate"
case Provenance:
return "provenance"
case Label:
return "label"
case Object:
return "object"
default:
@ -98,45 +104,49 @@ func (d Direction) String() string {
// instead of the pointer. This needs benchmarking to make the decision.
// Per-field accessor for triples
func (t *Triple) Get(d Direction) string {
func (q *Quad) Get(d Direction) string {
switch d {
case Subject:
return t.Subject
return q.Subject
case Predicate:
return t.Predicate
case Provenance:
return t.Provenance
return q.Predicate
case Label:
return q.Label
case Object:
return t.Object
return q.Object
default:
panic(d.String())
}
}
func (t *Triple) Equals(o *Triple) bool {
return *t == *o
func (q *Quad) Equals(o *Quad) bool {
return *q == *o
}
// Pretty-prints a triple.
func (t *Triple) String() string {
func (q *Quad) String() string {
// TODO(kortschak) String methods should generally not terminate in '\n'.
return fmt.Sprintf("%s -- %s -> %s\n", t.Subject, t.Predicate, t.Object)
return fmt.Sprintf("%s -- %s -> %s\n", q.Subject, q.Predicate, q.Object)
}
func (t *Triple) IsValid() bool {
return t.Subject != "" && t.Predicate != "" && t.Object != ""
func (q *Quad) IsValid() bool {
return q.Subject != "" && q.Predicate != "" && q.Object != ""
}
// TODO(kortschak) NTriple looks like a good candidate for conversion
// to MarshalText() (text []byte, err error) and then move parsing code
// from nquads to here to provide UnmarshalText(text []byte) error.
// Prints a triple in N-Triple format.
func (t *Triple) NTriple() string {
if t.Provenance == "" {
// Prints a triple in N-Quad format.
func (q *Quad) NTriple() string {
if q.Label == "" {
//TODO(barakmich): Proper escaping.
return fmt.Sprintf("%s %s %s .", t.Subject, t.Predicate, t.Object)
return fmt.Sprintf("%s %s %s .", q.Subject, q.Predicate, q.Object)
} else {
return fmt.Sprintf("%s %s %s %s .", t.Subject, t.Predicate, t.Object, t.Provenance)
return fmt.Sprintf("%s %s %s %s .", q.Subject, q.Predicate, q.Object, q.Label)
}
}
type Unmarshaler interface {
Unmarshal() (*Quad, error)
}
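
The Unmarshaler interface above lets callers consume quads without knowing which parser produced them. The sketch below shows one way a caller might drain such a source. It is self-contained, so Quad and Unmarshaler are local copies of the declarations in this diff, while sliceSource and readAll are hypothetical names introduced only for illustration. It assumes, as the Decoder implementations in this commit do, that the end of input is signalled with io.EOF.

```go
package main

import (
	"fmt"
	"io"
)

// Quad and Unmarshaler are local copies of the declarations in the diff,
// repeated here so that the example compiles on its own.
type Quad struct {
	Subject   string
	Predicate string
	Object    string
	Label     string
}

type Unmarshaler interface {
	Unmarshal() (*Quad, error)
}

// sliceSource is a hypothetical in-memory Unmarshaler used for illustration.
type sliceSource struct {
	quads []*Quad
}

// Unmarshal returns the next stored quad, or io.EOF when none remain.
func (s *sliceSource) Unmarshal() (*Quad, error) {
	if len(s.quads) == 0 {
		return nil, io.EOF
	}
	q := s.quads[0]
	s.quads = s.quads[1:]
	return q, nil
}

// readAll (hypothetical helper) drains u until io.EOF and returns the
// quads read so far together with any other error.
func readAll(u Unmarshaler) ([]*Quad, error) {
	var out []*Quad
	for {
		q, err := u.Unmarshal()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, q)
	}
}

func main() {
	src := &sliceSource{quads: []*Quad{
		{"A", "follows", "B", ""},
		{"G", "status", "cool", "status_graph"},
	}}
	quads, err := readAll(src)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, q := range quads {
		fmt.Printf("%s -- %s -> %s (label %q)\n", q.Subject, q.Predicate, q.Object, q.Label)
	}
}
```

Because readAll depends only on the interface, the same loop would work with either the strict or the simplified decoder from this commit.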

View file

@ -22,6 +22,7 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
func getStrings(obj *otto.Object, field string) []string {
@ -135,17 +136,17 @@ func buildInOutIterator(obj *otto.Object, ts graph.TripleStore, base graph.Itera
tags = makeListOfStringsFromArrayValue(one.Object())
}
for _, tag := range tags {
predicateNodeIterator.AddTag(tag)
predicateNodeIterator.Tagger().Add(tag)
}
}
in, out := graph.Subject, graph.Object
in, out := quad.Subject, quad.Object
if isReverse {
in, out = out, in
}
lto := iterator.NewLinksTo(ts, base, in)
and := iterator.NewAnd()
and.AddSubIterator(iterator.NewLinksTo(ts, predicateNodeIterator, graph.Predicate))
and.AddSubIterator(iterator.NewLinksTo(ts, predicateNodeIterator, quad.Predicate))
and.AddSubIterator(lto)
return iterator.NewHasA(ts, and, out)
}
@ -179,7 +180,7 @@ func buildIteratorTreeHelper(obj *otto.Object, ts graph.TripleStore, base graph.
case "tag":
it = subIt
for _, tag := range stringArgs {
it.AddTag(tag)
it.Tagger().Add(tag)
}
case "save":
all := ts.NodesAllIterator()
@ -187,16 +188,16 @@ func buildIteratorTreeHelper(obj *otto.Object, ts graph.TripleStore, base graph.
return iterator.NewNull()
}
if len(stringArgs) == 2 {
all.AddTag(stringArgs[1])
all.Tagger().Add(stringArgs[1])
} else {
all.AddTag(stringArgs[0])
all.Tagger().Add(stringArgs[0])
}
predFixed := ts.FixedIterator()
predFixed.Add(ts.ValueOf(stringArgs[0]))
subAnd := iterator.NewAnd()
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, graph.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, all, graph.Object))
hasa := iterator.NewHasA(ts, subAnd, graph.Subject)
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, quad.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, all, quad.Object))
hasa := iterator.NewHasA(ts, subAnd, quad.Subject)
and := iterator.NewAnd()
and.AddSubIterator(hasa)
and.AddSubIterator(subIt)
@ -207,16 +208,16 @@ func buildIteratorTreeHelper(obj *otto.Object, ts graph.TripleStore, base graph.
return iterator.NewNull()
}
if len(stringArgs) == 2 {
all.AddTag(stringArgs[1])
all.Tagger().Add(stringArgs[1])
} else {
all.AddTag(stringArgs[0])
all.Tagger().Add(stringArgs[0])
}
predFixed := ts.FixedIterator()
predFixed.Add(ts.ValueOf(stringArgs[0]))
subAnd := iterator.NewAnd()
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, graph.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, all, graph.Subject))
hasa := iterator.NewHasA(ts, subAnd, graph.Object)
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, quad.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, all, quad.Subject))
hasa := iterator.NewHasA(ts, subAnd, quad.Object)
and := iterator.NewAnd()
and.AddSubIterator(hasa)
and.AddSubIterator(subIt)
@ -232,9 +233,9 @@ func buildIteratorTreeHelper(obj *otto.Object, ts graph.TripleStore, base graph.
predFixed := ts.FixedIterator()
predFixed.Add(ts.ValueOf(stringArgs[0]))
subAnd := iterator.NewAnd()
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, graph.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed, graph.Object))
hasa := iterator.NewHasA(ts, subAnd, graph.Subject)
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, quad.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed, quad.Object))
hasa := iterator.NewHasA(ts, subAnd, quad.Subject)
and := iterator.NewAnd()
and.AddSubIterator(hasa)
and.AddSubIterator(subIt)

View file

@ -38,7 +38,7 @@ func embedFinals(env *otto.Otto, ses *Session, obj *otto.Object) {
func allFunc(env *otto.Otto, ses *Session, obj *otto.Object) func(otto.FunctionCall) otto.Value {
return func(call otto.FunctionCall) otto.Value {
it := buildIteratorTree(obj, ses.ts)
it.AddTag(TopResultTag)
it.Tagger().Add(TopResultTag)
ses.limit = -1
ses.count = 0
runIteratorOnSession(it, ses)
@ -51,7 +51,7 @@ func limitFunc(env *otto.Otto, ses *Session, obj *otto.Object) func(otto.Functio
if len(call.ArgumentList) > 0 {
limitVal, _ := call.Argument(0).ToInteger()
it := buildIteratorTree(obj, ses.ts)
it.AddTag(TopResultTag)
it.Tagger().Add(TopResultTag)
ses.limit = int(limitVal)
ses.count = 0
runIteratorOnSession(it, ses)
@ -63,7 +63,7 @@ func limitFunc(env *otto.Otto, ses *Session, obj *otto.Object) func(otto.Functio
func toArrayFunc(env *otto.Otto, ses *Session, obj *otto.Object, withTags bool) func(otto.FunctionCall) otto.Value {
return func(call otto.FunctionCall) otto.Value {
it := buildIteratorTree(obj, ses.ts)
it.AddTag(TopResultTag)
it.Tagger().Add(TopResultTag)
limit := -1
if len(call.ArgumentList) > 0 {
limitParsed, _ := call.Argument(0).ToInteger()
@ -90,7 +90,7 @@ func toArrayFunc(env *otto.Otto, ses *Session, obj *otto.Object, withTags bool)
func toValueFunc(env *otto.Otto, ses *Session, obj *otto.Object, withTags bool) func(otto.FunctionCall) otto.Value {
return func(call otto.FunctionCall) otto.Value {
it := buildIteratorTree(obj, ses.ts)
it.AddTag(TopResultTag)
it.Tagger().Add(TopResultTag)
limit := 1
var val otto.Value
var err error
@ -120,7 +120,7 @@ func toValueFunc(env *otto.Otto, ses *Session, obj *otto.Object, withTags bool)
func mapFunc(env *otto.Otto, ses *Session, obj *otto.Object) func(otto.FunctionCall) otto.Value {
return func(call otto.FunctionCall) otto.Value {
it := buildIteratorTree(obj, ses.ts)
it.AddTag(TopResultTag)
it.Tagger().Add(TopResultTag)
limit := -1
if len(call.ArgumentList) == 0 {
return otto.NullValue()
@ -151,7 +151,7 @@ func runIteratorToArray(it graph.Iterator, ses *Session, limit int) []map[string
if ses.doHalt {
return nil
}
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
break
}
@ -187,7 +187,7 @@ func runIteratorToArrayNoTags(it graph.Iterator, ses *Session, limit int) []stri
if ses.doHalt {
return nil
}
val, ok := it.Next()
val, ok := graph.Next(it)
if !ok {
break
}
@ -208,7 +208,7 @@ func runIteratorWithCallback(it graph.Iterator, ses *Session, callback otto.Valu
if ses.doHalt {
return
}
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
break
}
@ -249,7 +249,7 @@ func runIteratorOnSession(it graph.Iterator, ses *Session) {
if ses.doHalt {
return
}
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
break
}

View file

@ -21,6 +21,7 @@ import (
"github.com/google/cayley/graph"
_ "github.com/google/cayley/graph/memstore"
"github.com/google/cayley/quad"
)
// This is a simple test graph.
@ -36,7 +37,7 @@ import (
// \-->|#D#|------------->+---+
// +---+
//
var simpleGraph = []*graph.Triple{
var simpleGraph = []*quad.Quad{
{"A", "follows", "B", ""},
{"C", "follows", "B", ""},
{"C", "follows", "D", ""},
@ -50,7 +51,7 @@ var simpleGraph = []*graph.Triple{
{"G", "status", "cool", "status_graph"},
}
func makeTestSession(data []*graph.Triple) *Session {
func makeTestSession(data []*quad.Quad) *Session {
ts, _ := graph.NewTripleStore("memstore", "", nil)
for _, t := range data {
ts.AddTriple(t)
@ -244,7 +245,7 @@ var testQueries = []struct {
},
}
func runQueryGetTag(g []*graph.Triple, query string, tag string) []string {
func runQueryGetTag(g []*quad.Quad, query string, tag string) []string {
js := makeTestSession(g)
c := make(chan interface{}, 5)
js.ExecInput(query, c, -1)

View file

@ -23,6 +23,7 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
func (q *Query) buildFixed(s string) graph.Iterator {
@ -33,7 +34,7 @@ func (q *Query) buildFixed(s string) graph.Iterator {
func (q *Query) buildResultIterator(path Path) graph.Iterator {
all := q.ses.ts.NodesAllIterator()
all.AddTag(string(path))
all.Tagger().Add(string(path))
return all
}
@ -97,7 +98,7 @@ func (q *Query) buildIteratorTreeInternal(query interface{}, path Path) (it grap
if err != nil {
return nil, false, err
}
it.AddTag(string(path))
it.Tagger().Add(string(path))
return it, optional, nil
}
@@ -139,16 +140,16 @@ func (q *Query) buildIteratorTreeMapInternal(query map[string]interface{}, path
subAnd := iterator.NewAnd()
predFixed := q.ses.ts.FixedIterator()
predFixed.Add(q.ses.ts.ValueOf(pred))
subAnd.AddSubIterator(iterator.NewLinksTo(q.ses.ts, predFixed, graph.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(q.ses.ts, predFixed, quad.Predicate))
if reverse {
lto := iterator.NewLinksTo(q.ses.ts, builtIt, graph.Subject)
lto := iterator.NewLinksTo(q.ses.ts, builtIt, quad.Subject)
subAnd.AddSubIterator(lto)
hasa := iterator.NewHasA(q.ses.ts, subAnd, graph.Object)
hasa := iterator.NewHasA(q.ses.ts, subAnd, quad.Object)
subit = hasa
} else {
lto := iterator.NewLinksTo(q.ses.ts, builtIt, graph.Object)
lto := iterator.NewLinksTo(q.ses.ts, builtIt, quad.Object)
subAnd.AddSubIterator(lto)
hasa := iterator.NewHasA(q.ses.ts, subAnd, graph.Subject)
hasa := iterator.NewHasA(q.ses.ts, subAnd, quad.Subject)
subit = hasa
}
}

View file

@@ -21,6 +21,7 @@ import (
"github.com/google/cayley/graph"
_ "github.com/google/cayley/graph/memstore"
"github.com/google/cayley/quad"
)
// This is a simple test graph.
@@ -36,7 +37,7 @@ import (
// \-->|#D#|------------->+---+
// +---+
//
var simpleGraph = []*graph.Triple{
var simpleGraph = []*quad.Quad{
{"A", "follows", "B", ""},
{"C", "follows", "B", ""},
{"C", "follows", "D", ""},
@@ -50,7 +51,7 @@ var simpleGraph = []*graph.Triple{
{"G", "status", "cool", "status_graph"},
}
func makeTestSession(data []*graph.Triple) *Session {
func makeTestSession(data []*quad.Quad) *Session {
ts, _ := graph.NewTripleStore("memstore", "", nil)
for _, t := range data {
ts.AddTriple(t)
@@ -164,7 +165,7 @@ var testQueries = []struct {
},
}
func runQuery(g []*graph.Triple, query string) interface{} {
func runQuery(g []*quad.Quad, query string) interface{} {
s := makeTestSession(g)
c := make(chan interface{}, 5)
go s.ExecInput(query, c, -1)

View file

@@ -88,7 +88,7 @@ func (s *Session) ExecInput(input string, c chan interface{}, limit int) {
glog.V(2).Infoln(it.DebugString(0))
}
for {
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
break
}

View file

@@ -48,7 +48,7 @@ $(function() {
subject: $("#subject").val(),
predicate: $("#predicate").val(),
object: $("#object").val(),
provenance: $("#provenance").val()
label: $("#label").val()
}
if (!checkTriple(triple)) {
return
@@ -68,7 +68,7 @@ $(function() {
subject: $("#rsubject").val(),
predicate: $("#rpredicate").val(),
object: $("#robject").val(),
provenance: $("#rprovenance").val()
label: $("#rlabel").val()
}
if (!checkTriple(triple)) {
return

View file

@@ -45,7 +45,7 @@
<input id="subject" type="text" placeholder="Subject"></input>
<input id="predicate" type="text" placeholder="Predicate"></input>
<input id="object" type="text" placeholder="Object"></input>
<input id="provenance" type="text" placeholder="Provenance"></input>
<input id="label" type="text" placeholder="Label"></input>
</div>
</div>
<div class="row button-row">
@@ -59,7 +59,7 @@
<input id="rsubject" type="text" placeholder="Subject"></input>
<input id="rpredicate" type="text" placeholder="Predicate"></input>
<input id="robject" type="text" placeholder="Object"></input>
<input id="rprovenance" type="text" placeholder="Provenance"></input>
<input id="rlabel" type="text" placeholder="Label"></input>
</div>
</div><!-- /.col-xs-12 main -->
<div class="row button-row">

9
testdata.nq Normal file
View file

@@ -0,0 +1,9 @@
<alice> <follows> <bob> .
<bob> <follows> <alice> .
<charlie> <follows> <bob> .
<dani> <follows> <charlie> .
<dani> <follows> <alice> .
<alice> <is> "cool" .
<bob> <is> "not cool" .
<charlie> <is> "cool" .
<dani> <is> "not cool" .

View file

@@ -1,9 +0,0 @@
alice follows bob .
bob follows alice .
charlie follows bob .
dani follows charlie .
dani follows alice .
alice is cool .
bob is "not cool" .
charlie is cool .
dani is "not cool" .