update to master

Barak Michener 2014-08-04 00:39:19 -04:00
commit cedaac35d0
93 changed files with 15683 additions and 2361 deletions


@ -1,6 +1,6 @@
{
"Arch": "amd64 386",
"Arch": "arm amd64 386",
"Os": "linux darwin windows",
"ResourcesInclude": "README.md,static,templates,LICENSE,AUTHORS,CONTRIBUTORS,docs,cayley.cfg.example,30kmoviedata.nt.gz,testdata.nt",
"ResourcesInclude": "README.md,static,templates,LICENSE,AUTHORS,CONTRIBUTORS,docs,cayley.cfg.example,30kmoviedata.nq.gz,testdata.nq",
"ConfigVersion": "0.9"
}


@ -19,3 +19,6 @@ install:
- go get github.com/syndtr/goleveldb/leveldb/util
- go get gopkg.in/mgo.v2
- go get gopkg.in/mgo.v2/bson
script: go test -v -short ./...

BIN
30kmoviedata.nq.gz Normal file

Binary file not shown.



@ -72,13 +72,13 @@ cayley> graph.Vertex("dani").Out("follows").All()
For somewhat more interesting data, a sample of 30k movies from Freebase comes in the checkout.
```
./cayley repl --dbpath=30kmoviedata.nt.gz
./cayley repl --dbpath=30kmoviedata.nq.gz
```
To run the web frontend, replace the "repl" command with "http"
```
./cayley http --dbpath=30kmoviedata.nt.gz
./cayley http --dbpath=30kmoviedata.nq.gz
```
And visit port 64210 on your machine, commonly [http://localhost:64210](http://localhost:64210)
@ -90,13 +90,13 @@ The default environment is based on [Gremlin](http://gremlindocs.com/) and is si
You'll notice we have a special object, `graph` or `g`, which is how you can interact with the graph.
The simplest query is merely to return a single vertex. Using the 30kmovies.nt dataset from above, let's walk through some simple queries:
The simplest query is merely to return a single vertex. Using the 30kmoviedata.nq dataset from above, let's walk through some simple queries:
```javascript
// Query all vertices in the graph, limit to the first 5 vertices found.
graph.Vertex().GetLimit(5)
// Start with only one vertex, the literal name "Humphrey Bogart", and retreive all of them.
// Start with only one vertex, the literal name "Humphrey Bogart", and retrieve all of them.
graph.Vertex("Humphrey Bogart").All()
// `g` and `V` are synonyms for `graph` and `Vertex` respectively, as they are quite common.

TODO.md

@ -26,11 +26,11 @@ Usually something that should be taken care of.
### Bootstraps
Start discussing bootstrap triples, things that make the database self-describing, if they exist (though they need not). Talk about sameAs and indexing and type systems and whatnot.
### Better surfacing of Provenance
### Better surfacing of Label
It exists, it's indexed, but it's basically useless right now
### Optimize HasA Iterator
There are some simple optimizations that can be done there. And was the first one to get right, this is the next one.
There are some simple optimizations that can be done there. And was the first one to get right, this is the next one.
A simple example is just to convert the HasA to a fixed (next them out) if the subiterator size is guessable and small.
### Gremlin features
@ -39,7 +39,7 @@ A simple example is just to convert the HasA to a fixed (next them out) if the s
A way to limit the number of subresults at a point, without even running the query. Essentially, much as GetLimit() does for the end, be able to do the same in between
#### "Up" and "Down" traversals
Getting to the predicates from a node, or the nodes from a predicate, or some odd combinations thereof. Ditto for provenance.
Getting to the predicates from a node, or the nodes from a predicate, or some odd combinations thereof. Ditto for label.
#### Value comparison
Expose the value-comparison iterator in the language
@ -52,7 +52,7 @@ An important failure of MQL before was that it was never well-specified. Let's n
### New Iterators
#### Limit Iterator
The necessary component to make mid-query limit work. Acts as a limit on Next(), a passthrough on Check(), and a limit on NextResult()
The necessary component to make mid-query limit work. Acts as a limit on Next(), a passthrough on Contains(), and a limit on NextResult()
## Medium Term
@ -66,7 +66,7 @@ The necessary component to make mid-query limit work. Acts as a limit on Next(),
Hopefully easy now that the AppEngine shim exists. Questionably fast.
### Postgres Backend
It'd be nice to run on SQL as well. It's a big why not?
It'd be nice to run on SQL as well. It's a big why not?
#### Generalist layout
Notionally, this is a simple triple table with a number of indices. Iterators and iterator optimization (i.e., rewriting SQL queries) is the 'fun' part
#### "Short Schema" Layout?
@ -75,7 +75,7 @@ The necessary component to make mid-query limit work. Acts as a limit on Next(),
### New Iterators
#### Predicate Iterator
Really, this is just the generalized value comparison iterator, across strings and dates and such.
Really, this is just the generalized value comparison iterator, across strings and dates and such.
## Longer Term (and fuzzy)
@ -83,7 +83,7 @@ The necessary component to make mid-query limit work. Acts as a limit on Next(),
There's a whole body of work there, and a lot of interested researchers. They're the choir who already know the sermon of graph stores. Once ease-of-use gets people in the door, supporting extensions that make everyone happy seems like a win. And because we're query-language agnostic, it's a cleaner win. See also bootstrapping, which is the first goal toward this (eg, let's talk about sameAs, and index it appropriately.)
### Replication
Technically it works now if you piggyback on someone else's replication, but that's cheating. We speak HTTP, we can send triple sets over the wire to some other instance. Bonus points for a way to apply morphisms first -- massive graph on the backend, important graph on the frontend.
Technically it works now if you piggyback on someone else's replication, but that's cheating. We speak HTTP, we can send triple sets over the wire to some other instance. Bonus points for a way to apply morphisms first -- massive graph on the backend, important graph on the frontend.
### Related services
Eg, topic service, recon service -- whether in Cayley itself or as part of the greater project.
@ -102,6 +102,6 @@ The necessary component to make mid-query limit work. Acts as a limit on Next(),
### All sorts of backends:
#### Git?
Can we access git in a meaningful fashion, giving a history and rollbacks to memory/flat files?
#### ElasticSearch
#### Cassandra
#### ElasticSearch
#### Cassandra
#### Redis
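
Note on the "Limit Iterator" item in TODO.md above: the described behavior (a cap on Next(), a passthrough on Contains()) can be sketched roughly as follows. This is a minimal sketch only. The `Iterator` interface, `sliceIterator`, `Limit`, `NewLimit`, and `collect` are hypothetical names invented for illustration and are not Cayley's `graph.Iterator` API; NextResult() is left out for brevity.

```go
package main

import "fmt"

// Iterator is a hypothetical, pared-down stand-in for a graph iterator.
// It is NOT Cayley's graph.Iterator interface.
type Iterator interface {
	Next() (string, bool)   // advance and return the next value, if any
	Contains(v string) bool // membership check
}

// sliceIterator walks a fixed list of values (illustration only).
type sliceIterator struct {
	vals []string
	pos  int
}

func (s *sliceIterator) Next() (string, bool) {
	if s.pos >= len(s.vals) {
		return "", false
	}
	v := s.vals[s.pos]
	s.pos++
	return v, true
}

func (s *sliceIterator) Contains(v string) bool {
	for _, x := range s.vals {
		if x == v {
			return true
		}
	}
	return false
}

// Limit caps how many values Next() yields and passes Contains() through.
type Limit struct {
	sub   Iterator
	limit int
	count int
}

func NewLimit(sub Iterator, limit int) *Limit {
	return &Limit{sub: sub, limit: limit}
}

// Next stops once limit values have been returned.
func (l *Limit) Next() (string, bool) {
	if l.count >= l.limit {
		return "", false
	}
	v, ok := l.sub.Next()
	if ok {
		l.count++
	}
	return v, ok
}

// Contains is a plain passthrough: the limit does not affect membership.
func (l *Limit) Contains(v string) bool { return l.sub.Contains(v) }

// collect drains an iterator into a slice.
func collect(it Iterator) []string {
	var out []string
	for {
		v, ok := it.Next()
		if !ok {
			return out
		}
		out = append(out, v)
	}
}

func main() {
	it := NewLimit(&sliceIterator{vals: []string{"a", "b", "c", "d"}}, 2)
	fmt.Println(collect(it), it.Contains("d"))
}
```

Keeping Contains() as a passthrough means a limited branch can still be checked by a parent iterator without the cap changing which values count as members.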


@ -40,15 +40,20 @@ var cpuprofile = flag.String("prof", "", "Output profiling file.")
var queryLanguage = flag.String("query_lang", "gremlin", "Use this parser as the query language.")
var configFile = flag.String("config", "", "Path to an explicit configuration file.")
// Filled in by `go build -ldflags="-X main.VERSION `ver`"`.
var BUILD_DATE string
var VERSION string
func Usage() {
fmt.Println("Cayley is a graph store and graph query layer.")
fmt.Println("\nUsage:")
fmt.Println(" cayley COMMAND [flags]")
fmt.Println("\nCommands:")
fmt.Println(" init\tCreate an empty database.")
fmt.Println(" load\tBulk-load a triple file into the database.")
fmt.Println(" http\tServe an HTTP endpoint on the given host and port.")
fmt.Println(" repl\tDrop into a REPL of the given query language.")
fmt.Println(" init Create an empty database.")
fmt.Println(" load Bulk-load a triple file into the database.")
fmt.Println(" http Serve an HTTP endpoint on the given host and port.")
fmt.Println(" repl Drop into a REPL of the given query language.")
fmt.Println(" version Version information.")
fmt.Println("\nFlags:")
flag.Parse()
flag.PrintDefaults()
@ -62,12 +67,18 @@ func main() {
}
cmd := os.Args[1]
newargs := make([]string, 0)
var newargs []string
newargs = append(newargs, os.Args[0])
newargs = append(newargs, os.Args[2:]...)
os.Args = newargs
flag.Parse()
var buildString string
if VERSION != "" {
buildString = fmt.Sprint("Cayley ", VERSION, " built ", BUILD_DATE)
glog.Infoln(buildString)
}
cfg := config.ParseConfigFromFlagsAndFile(*configFile)
if os.Getenv("GOMAXPROCS") == "" {
@ -82,6 +93,13 @@ func main() {
err error
)
switch cmd {
case "version":
if VERSION != "" {
fmt.Println(buildString)
} else {
fmt.Println("Cayley snapshot")
}
os.Exit(0)
case "init":
err = db.Init(cfg, *tripleFile)
case "load":
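
Note on the version handling in the hunks above: the two package-level strings are expected to be set by the linker, and the `version` command falls back to "Cayley snapshot" when they are empty. A minimal standalone sketch of that pattern follows. The helper name `versionString` and the example values in the comment are assumptions made for illustration. Go 1.5 and later take the `-X name=value` form shown in the comment, while older toolchains used the space-separated form quoted in the diff.

```go
package main

import (
	"fmt"
	"os"
)

// Set at link time, for example (hypothetical values):
//   go build -ldflags "-X main.VERSION=0.3.1 -X main.BUILD_DATE=2014-08-04"
var (
	VERSION    string
	BUILD_DATE string
)

// versionString (hypothetical helper) mirrors the logic in the diff:
// a full build string when VERSION is set, a snapshot marker otherwise.
func versionString(version, buildDate string) string {
	if version == "" {
		return "Cayley snapshot"
	}
	return fmt.Sprint("Cayley ", version, " built ", buildDate)
}

func main() {
	if len(os.Args) > 1 && os.Args[1] == "version" {
		fmt.Println(versionString(VERSION, BUILD_DATE))
	}
}
```

Built without any `-ldflags`, both variables stay empty and the sketch reports the snapshot marker.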


@ -1,7 +1,7 @@
{
"database": "mem",
"db_path": "30k.nt",
"db_path": "30kmoviedata.nq.gz",
"read_only": true,
"load_size": 10000,
"gremlin_timeout": 10
"timeout": 10
}

cayley_test.go Normal file

@ -0,0 +1,409 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package main
import (
"sync"
"testing"
"time"
"github.com/google/cayley/config"
"github.com/google/cayley/db"
"github.com/google/cayley/graph"
"github.com/google/cayley/query/gremlin"
)
var benchmarkQueries = []struct {
message string
long bool
query string
tag string
expect [][]interface{}
}{
// Easy one to get us started. How quick is the most straightforward retrieval?
{
message: "name predicate",
query: `
g.V("Humphrey Bogart").In("name").All()
`,
expect: [][]interface{}{
{map[string]string{"id": "/en/humphrey_bogart"}},
},
},
// Grunty queries.
// 2014-07-12: This one seems to return in ~20ms in memory;
// that's going to be measurably slower for every other backend.
{
message: "two large sets with no intersection",
query: `
function getId(x) { return g.V(x).In("name") }
var actor_to_film = g.M().In("/film/performance/actor").In("/film/film/starring")
getId("Oliver Hardy").Follow(actor_to_film).Out("name").Intersect(
getId("Mel Blanc").Follow(actor_to_film).Out("name")).All()
`,
expect: nil,
},
// 2014-07-12: This one takes about 4 whole seconds in memory. This is a behemoth.
{
message: "three huge sets with small intersection",
long: true,
query: `
function getId(x) { return g.V(x).In("name") }
var actor_to_film = g.M().In("/film/performance/actor").In("/film/film/starring")
var a = getId("Oliver Hardy").Follow(actor_to_film).FollowR(actor_to_film)
var b = getId("Mel Blanc").Follow(actor_to_film).FollowR(actor_to_film)
var c = getId("Billy Gilbert").Follow(actor_to_film).FollowR(actor_to_film)
seen = {}
a.Intersect(b).Intersect(c).ForEach(function (d) {
if (!(d.id in seen)) {
seen[d.id] = true;
g.Emit(d.id)
}
})
`,
expect: [][]interface{}{
{"/en/billy_gilbert"},
{"/en/sterling_holloway"},
},
},
// This is more of an optimization problem that will get better over time. This takes a lot
// of wrong turns on the walk down to what is ultimately the name, but top AND has it easy
// as it has a fixed ID. Exercises Contains().
{
message: "the helpless checker",
long: true,
query: `
g.V().As("person").In("name").In().In().Out("name").Is("Casablanca").All()
`,
tag: "person",
expect: [][]interface{}{
{map[string]string{"id": "Casablanca", "person": "Claude Rains"}},
{map[string]string{"id": "Casablanca", "person": "Conrad Veidt"}},
{map[string]string{"id": "Casablanca", "person": "Dooley Wilson"}},
{map[string]string{"id": "Casablanca", "person": "Helmut Dantine"}},
{map[string]string{"id": "Casablanca", "person": "Humphrey Bogart"}},
{map[string]string{"id": "Casablanca", "person": "Ingrid Bergman"}},
{map[string]string{"id": "Casablanca", "person": "John Qualen"}},
{map[string]string{"id": "Casablanca", "person": "Joy Page"}},
{map[string]string{"id": "Casablanca", "person": "Leonid Kinskey"}},
{map[string]string{"id": "Casablanca", "person": "Lou Marcelle"}},
{map[string]string{"id": "Casablanca", "person": "Madeleine LeBeau"}},
{map[string]string{"id": "Casablanca", "person": "Paul Henreid"}},
{map[string]string{"id": "Casablanca", "person": "Peter Lorre"}},
{map[string]string{"id": "Casablanca", "person": "Sydney Greenstreet"}},
{map[string]string{"id": "Casablanca", "person": "S.Z. Sakall"}},
},
},
//Q: Who starred in both "The Net" and "Speed" ?
//A: "Sandra Bullock"
{
message: "Net and Speed",
query: common + `m1_actors.Intersect(m2_actors).Out("name").All()
`,
expect: [][]interface{}{
{map[string]string{"id": "Sandra Bullock", "movie1": "The Net", "movie2": "Speed"}},
},
},
//Q: Did "Keanu Reeves" star in "The Net" ?
//A: No
{
message: "Keanu in The Net",
query: common + `actor2.Intersect(m1_actors).Out("name").All()
`,
expect: nil,
},
//Q: Did "Keanu Reeves" star in "Speed" ?
//A: Yes
{
message: "Keanu in Speed",
query: common + `actor2.Intersect(m2_actors).Out("name").All()
`,
expect: [][]interface{}{
{map[string]string{"id": "Keanu Reeves", "movie2": "Speed"}},
},
},
//Q: Has "Keanu Reeves" co-starred with anyone who starred in "The Net" ?
//A: "Keanu Reeves" was in "Speed" and "The Lake House" with "Sandra Bullock",
// who was in "The Net"
{
message: "Keanu with other in The Net",
long: true,
query: common + `actor2.Follow(coStars1).Intersect(m1_actors).Out("name").All()
`,
expect: [][]interface{}{
{map[string]string{"id": "Sandra Bullock", "movie1": "The Net", "costar1_movie": "Speed"}},
{map[string]string{"movie1": "The Net", "costar1_movie": "The Lake House", "id": "Sandra Bullock"}},
},
},
//Q: Do "Keanu Reeves" and "Sandra Bullock" have any common co-stars?
//A: Yes, many. For example: SB starred with "Steve Martin" in "The Prince
// of Egypt", and KR starred with Steve Martin in "Parenthood".
{
message: "Keanu and Bullock with other",
long: true,
query: common + `actor1.Save("name","costar1_actor").Follow(coStars1).Intersect(actor2.Save("name","costar2_actor").Follow(coStars2)).Out("name").All()
`,
expect: [][]interface{}{
{map[string]string{"costar2_movie": "Speed", "id": "Alan Ruck", "costar1_actor": "Sandra Bullock", "costar1_movie": "Speed", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar1_movie": "Demolition Man", "costar2_actor": "Keanu Reeves", "costar2_movie": "Thumbsucker", "id": "Benjamin Bratt", "costar1_actor": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Miss Congeniality", "costar2_actor": "Keanu Reeves", "costar2_movie": "Thumbsucker", "id": "Benjamin Bratt"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Speed", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Beth Grant"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Speed", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Carlos Carrasco"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "The Lake House", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Lake House", "id": "Christopher Plummer"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "The Proposal", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Devil's Advocate", "id": "Craig T. Nelson"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Speed", "costar2_actor": "Keanu Reeves", "costar2_movie": "River's Edge", "id": "Dennis Hopper"}},
{map[string]string{"costar2_movie": "Speed", "id": "Dennis Hopper", "costar1_actor": "/people/person", "costar1_movie": "Chattahoochee", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Practical Magic", "costar2_actor": "Keanu Reeves", "costar2_movie": "Parenthood", "id": "Dianne Wiest"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "The Lake House", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Lake House", "id": "Dylan Walsh"}},
{map[string]string{"costar2_movie": "Speed", "id": "Glenn Plummer", "costar1_actor": "Sandra Bullock", "costar1_movie": "Speed", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Speed 2: Cruise Control", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Glenn Plummer"}},
{map[string]string{"costar1_movie": "While You Were Sleeping", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Replacements", "id": "Jack Warden", "costar1_actor": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Infamous", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Jeff Daniels"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Speed", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Jeff Daniels"}},
{map[string]string{"id": "Joe Morton", "costar1_actor": "Sandra Bullock", "costar1_movie": "Speed", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed"}},
{map[string]string{"costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Jordan Lund", "costar1_actor": "Sandra Bullock", "costar1_movie": "Speed"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Speed", "costar2_actor": "Keanu Reeves", "costar2_movie": "Flying", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "The Lake House", "costar2_actor": "Keanu Reeves", "costar2_movie": "Flying", "id": "Keanu Reeves"}},
{map[string]string{"costar1_movie": "The Day the Earth Stood Still", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Animatrix", "id": "Keanu Reeves", "costar1_actor": "/people/person"}},
{map[string]string{"id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Tune in Tomorrow"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Last Time I Committed Suicide", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Constantine", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Permanent Record", "id": "Keanu Reeves"}},
{map[string]string{"costar2_movie": "Dangerous Liaisons", "id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Private Lives of Pippa Lee", "id": "Keanu Reeves"}},
{map[string]string{"costar2_movie": "A Scanner Darkly", "id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "A Walk in the Clouds", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Hardball", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Life Under Water", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Much Ado About Nothing", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "My Own Private Idaho", "id": "Keanu Reeves"}},
{map[string]string{"costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Parenthood", "id": "Keanu Reeves", "costar1_actor": "/people/person"}},
{map[string]string{"costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Point Break", "id": "Keanu Reeves", "costar1_actor": "/people/person"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Providence", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "River's Edge", "id": "Keanu Reeves"}},
{map[string]string{"costar2_movie": "Something's Gotta Give", "id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Sweet November", "id": "Keanu Reeves"}},
{map[string]string{"costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Lake House", "id": "Keanu Reeves", "costar1_actor": "/people/person"}},
{map[string]string{"id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Matrix Reloaded"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Matrix Revisited", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Prince of Pennsylvania", "id": "Keanu Reeves"}},
{map[string]string{"costar2_actor": "Keanu Reeves", "costar2_movie": "The Replacements", "id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Even Cowgirls Get the Blues", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Youngblood", "id": "Keanu Reeves"}},
{map[string]string{"costar2_movie": "Bill & Ted's Bogus Journey", "id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Bill & Ted's Excellent Adventure", "id": "Keanu Reeves"}},
{map[string]string{"id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Johnny Mnemonic"}},
{map[string]string{"costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Devil's Advocate", "id": "Keanu Reeves", "costar1_actor": "/people/person"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Thumbsucker", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "I Love You to Death", "id": "Keanu Reeves"}},
{map[string]string{"costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Bram Stoker's Dracula", "id": "Keanu Reeves", "costar1_actor": "/people/person"}},
{map[string]string{"costar2_actor": "Keanu Reeves", "costar2_movie": "The Gift", "id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film"}},
{map[string]string{"costar2_movie": "Little Buddha", "id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Night Watchman", "id": "Keanu Reeves"}},
{map[string]string{"id": "Keanu Reeves", "costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Chain Reaction"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "Babes in Toyland", "id": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Day the Earth Stood Still", "id": "Keanu Reeves"}},
{map[string]string{"costar2_actor": "Keanu Reeves", "costar2_movie": "The Lake House", "id": "Lynn Collins", "costar1_actor": "Sandra Bullock", "costar1_movie": "The Lake House"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "The Proposal", "costar2_actor": "Keanu Reeves", "costar2_movie": "Parenthood", "id": "Mary Steenburgen"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "The Prince of Egypt", "costar2_actor": "Keanu Reeves", "costar2_movie": "Dangerous Liaisons", "id": "Michelle Pfeiffer"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Premonition", "costar2_actor": "Keanu Reeves", "costar2_movie": "Constantine", "id": "Peter Stormare"}},
{map[string]string{"costar2_movie": "Speed", "id": "Richard Lineback", "costar1_actor": "Sandra Bullock", "costar1_movie": "Speed", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar1_movie": "The Thing Called Love", "costar2_actor": "Keanu Reeves", "costar2_movie": "My Own Private Idaho", "id": "River Phoenix", "costar1_actor": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "I Love You to Death", "id": "River Phoenix"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "The Proposal", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Crash", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Gun Shy", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock"}},
{map[string]string{"costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "Demolition Man"}},
{map[string]string{"costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "Divine Secrets of the Ya-Ya Sisterhood", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "A Time to Kill", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Forces of Nature", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Hope Floats", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Infamous", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Love Potion No. 9", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock"}},
{map[string]string{"id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "Miss Congeniality", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed"}},
{map[string]string{"id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "Miss Congeniality 2: Armed and Fabulous", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Murder by Numbers", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock"}},
{map[string]string{"id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "Practical Magic", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed"}},
{map[string]string{"costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "Speed"}},
{map[string]string{"costar1_movie": "Speed 2: Cruise Control", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "The Lake House", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "The Net", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock"}},
{map[string]string{"costar1_movie": "The Prince of Egypt", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock"}},
{map[string]string{"costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "Two Weeks Notice"}},
{map[string]string{"costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "While You Were Sleeping", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "28 Days", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed"}},
{map[string]string{"costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "Premonition"}},
{map[string]string{"costar1_movie": "Wrestling Ernest Hemingway", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock"}},
{map[string]string{"costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock", "costar1_movie": "Fire on the Amazon"}},
{map[string]string{"costar1_movie": "The Thing Called Love", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock", "costar1_actor": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "In Love and War", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "/people/person", "costar1_movie": "/film/film", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Lake House", "id": "Sandra Bullock"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Divine Secrets of the Ya-Ya Sisterhood", "costar2_actor": "Keanu Reeves", "costar2_movie": "The Private Lives of Pippa Lee", "id": "Shirley Knight"}},
{map[string]string{"costar2_movie": "The Lake House", "id": "Shohreh Aghdashloo", "costar1_actor": "Sandra Bullock", "costar1_movie": "The Lake House", "costar2_actor": "Keanu Reeves"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "The Prince of Egypt", "costar2_actor": "Keanu Reeves", "costar2_movie": "Parenthood", "id": "Steve Martin"}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Speed", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Thomas Rosales, Jr."}},
{map[string]string{"costar1_actor": "Sandra Bullock", "costar1_movie": "Speed", "costar2_actor": "Keanu Reeves", "costar2_movie": "Speed", "id": "Hawthorne James"}},
},
},
}
const common = `
var movie1 = g.V().Has("name", "The Net")
var movie2 = g.V().Has("name", "Speed")
var actor1 = g.V().Has("name", "Sandra Bullock")
var actor2 = g.V().Has("name", "Keanu Reeves")
// (film) -> starring -> (actor)
var filmToActor = g.Morphism().Out("/film/film/starring").Out("/film/performance/actor")
// (actor) -> starring -> [film -> starring -> (actor)]
var coStars1 = g.Morphism().In("/film/performance/actor").In("/film/film/starring").Save("name","costar1_movie").Follow(filmToActor)
var coStars2 = g.Morphism().In("/film/performance/actor").In("/film/film/starring").Save("name","costar2_movie").Follow(filmToActor)
// Stars for the movies "The Net" and "Speed"
var m1_actors = movie1.Save("name","movie1").Follow(filmToActor)
var m2_actors = movie2.Save("name","movie2").Follow(filmToActor)
`
var (
once sync.Once
cfg = &config.Config{
DatabasePath: "30kmoviedata.nq.gz",
DatabaseType: "memstore",
Timeout: 300 * time.Second,
}
ts graph.TripleStore
)
func prepare(t testing.TB) {
var err error
once.Do(func() {
ts, err = db.Open(cfg)
if err != nil {
t.Fatalf("Failed to open %q: %v", cfg.DatabasePath, err)
}
})
}
func TestQueries(t *testing.T) {
prepare(t)
for _, test := range benchmarkQueries {
if testing.Short() && test.long {
continue
}
ses := gremlin.NewSession(ts, cfg.Timeout, true)
_, err := ses.InputParses(test.query)
if err != nil {
t.Fatalf("Failed to parse benchmark gremlin %s: %v", test.message, err)
}
c := make(chan interface{}, 5)
go ses.ExecInput(test.query, c, 100)
var (
got [][]interface{}
timedOut bool
)
for r := range c {
ses.BuildJson(r)
j, err := ses.GetJson()
if j == nil && err == nil {
continue
}
if err == gremlin.ErrKillTimeout {
timedOut = true
continue
}
got = append(got, j)
}
if timedOut {
t.Error("Query timed out: skipping validation.")
continue
}
// TODO(kortschak) Be more rigorous in this result validation.
if len(got) != len(test.expect) {
t.Errorf("Unexpected number of results, got:%d expect:%d.", len(got), len(test.expect))
}
}
}
func runBench(n int, b *testing.B) {
if testing.Short() && benchmarkQueries[n].long {
b.Skip()
}
prepare(b)
ses := gremlin.NewSession(ts, cfg.Timeout, true)
_, err := ses.InputParses(benchmarkQueries[n].query)
if err != nil {
b.Fatalf("Failed to parse benchmark gremlin %s: %v", benchmarkQueries[n].message, err)
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
c := make(chan interface{}, 5)
go ses.ExecInput(benchmarkQueries[n].query, c, 100)
for _ = range c {
}
}
}
func BenchmarkNamePredicate(b *testing.B) {
runBench(0, b)
}
func BenchmarkLargeSetsNoIntersection(b *testing.B) {
runBench(1, b)
}
func BenchmarkVeryLargeSetsSmallIntersection(b *testing.B) {
runBench(2, b)
}
func BenchmarkHelplessContainsChecker(b *testing.B) {
runBench(3, b)
}
func BenchmarkNetAndSpeed(b *testing.B) {
runBench(4, b)
}
func BenchmarkKeannuAndNet(b *testing.B) {
runBench(5, b)
}
func BenchmarkKeannuAndSpeed(b *testing.B) {
runBench(6, b)
}
func BenchmarkKeannuOther(b *testing.B) {
runBench(7, b)
}
func BenchmarkKeannuBullockOther(b *testing.B) {
runBench(8, b)
}

View file

@ -17,29 +17,112 @@ package config
import (
"encoding/json"
"flag"
"fmt"
"os"
"strconv"
"time"
"github.com/barakmich/glog"
)
type Config struct {
DatabaseType string
DatabasePath string
DatabaseOptions map[string]interface{}
ListenHost string
ListenPort string
ReadOnly bool
Timeout time.Duration
LoadSize int
}
type config struct {
DatabaseType string `json:"database"`
DatabasePath string `json:"db_path"`
DatabaseOptions map[string]interface{} `json:"db_options"`
ListenHost string `json:"listen_host"`
ListenPort string `json:"listen_port"`
ReadOnly bool `json:"read_only"`
GremlinTimeout int `json:"gremlin_timeout"`
Timeout duration `json:"timeout"`
LoadSize int `json:"load_size"`
}
var databasePath = flag.String("dbpath", "/tmp/testdb", "Path to the database.")
var databaseBackend = flag.String("db", "memstore", "Database Backend.")
var host = flag.String("host", "0.0.0.0", "Host to listen on (defaults to all).")
var loadSize = flag.Int("load_size", 10000, "Size of triplesets to load")
var port = flag.String("port", "64210", "Port to listen on.")
var readOnly = flag.Bool("read_only", false, "Disable writing via HTTP.")
var gremlinTimeout = flag.Int("gremlin_timeout", 30, "Number of seconds until an individual query times out.")
func (c *Config) UnmarshalJSON(data []byte) error {
var t config
err := json.Unmarshal(data, &t)
if err != nil {
return err
}
*c = Config{
DatabaseType: t.DatabaseType,
DatabasePath: t.DatabasePath,
DatabaseOptions: t.DatabaseOptions,
ListenHost: t.ListenHost,
ListenPort: t.ListenPort,
ReadOnly: t.ReadOnly,
Timeout: time.Duration(t.Timeout),
LoadSize: t.LoadSize,
}
return nil
}
func (c *Config) MarshalJSON() ([]byte, error) {
return json.Marshal(config{
DatabaseType: c.DatabaseType,
DatabasePath: c.DatabasePath,
DatabaseOptions: c.DatabaseOptions,
ListenHost: c.ListenHost,
ListenPort: c.ListenPort,
ReadOnly: c.ReadOnly,
Timeout: duration(c.Timeout),
LoadSize: c.LoadSize,
})
}
// duration is a time.Duration that satisfies the
// json.Unmarshaler and json.Marshaler interfaces.
type duration time.Duration
// UnmarshalJSON unmarshals a duration according to the following scheme:
// * If the element is absent the duration is zero.
// * If the element is parsable as a time.Duration, the parsed value is kept.
// * If the element is parsable as a number, that number of seconds is kept.
func (d *duration) UnmarshalJSON(data []byte) error {
if len(data) == 0 {
*d = 0
return nil
}
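text := string(data)
// A JSON string arrives with its surrounding quotes; strip them so that
// time.ParseDuration sees the bare duration text.
if s, err := strconv.Unquote(text); err == nil {
text = s
}
t, err := time.ParseDuration(text)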
if err == nil {
*d = duration(t)
return nil
}
i, err := strconv.ParseInt(text, 10, 64)
if err == nil {
*d = duration(time.Duration(i) * time.Second)
return nil
}
// This hack is to get around strconv.ParseInt
// not handling e-notation for integers.
f, err := strconv.ParseFloat(text, 64)
*d = duration(time.Duration(f) * time.Second)
return err
}
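func (d duration) MarshalJSON() ([]byte, error) {
// Convert to time.Duration so that %q quotes the String() form, e.g. "30s".
// A value receiver lets encoding/json call this for non-addressable values.
return []byte(fmt.Sprintf("%q", time.Duration(d))), nil
}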
var (
databasePath = flag.String("dbpath", "/tmp/testdb", "Path to the database.")
databaseBackend = flag.String("db", "memstore", "Database Backend.")
host = flag.String("host", "0.0.0.0", "Host to listen on (defaults to all).")
loadSize = flag.Int("load_size", 10000, "Size of triplesets to load")
port = flag.String("port", "64210", "Port to listen on.")
readOnly = flag.Bool("read_only", false, "Disable writing via HTTP.")
timeout = flag.Duration("timeout", 30*time.Second, "Elapsed time until an individual query times out.")
)
func ParseConfigFromFile(filename string) *Config {
config := &Config{}
@ -100,8 +183,8 @@ func ParseConfigFromFlagsAndFile(fileFlag string) *Config {
config.ListenPort = *port
}
if config.GremlinTimeout == 0 {
config.GremlinTimeout = *gremlinTimeout
if config.Timeout == 0 {
config.Timeout = *timeout
}
if config.LoadSize == 0 {

View file

@ -25,7 +25,8 @@ import (
"github.com/barakmich/glog"
"github.com/google/cayley/config"
"github.com/google/cayley/graph"
"github.com/google/cayley/nquads"
"github.com/google/cayley/quad"
"github.com/google/cayley/quad/cquads"
)
func Load(ts graph.TripleStore, cfg *config.Config, path string) error {
@ -40,7 +41,7 @@ func Load(ts graph.TripleStore, cfg *config.Config, path string) error {
glog.Fatalln(err)
}
dec := nquads.NewDecoder(r)
dec := cquads.NewDecoder(r)
bulker, canBulk := ts.(graph.BulkLoader)
if canBulk {
@ -56,7 +57,7 @@ func Load(ts graph.TripleStore, cfg *config.Config, path string) error {
return err
}
block := make([]*graph.Triple, 0, cfg.LoadSize)
block := make([]*quad.Quad, 0, cfg.LoadSize)
for {
t, err := dec.Unmarshal()
if err != nil {

View file

@ -22,7 +22,7 @@ import (
)
func Open(cfg *config.Config) (graph.TripleStore, error) {
glog.Infof("Opening database \"%s\" at %s", cfg.DatabaseType, cfg.DatabasePath)
glog.Infof("Opening database %q at %s", cfg.DatabaseType, cfg.DatabasePath)
ts, err := graph.NewTripleStore(cfg.DatabaseType, cfg.DatabasePath, cfg.DatabaseOptions)
if err != nil {
return nil, err

View file

@ -25,10 +25,11 @@ import (
"github.com/google/cayley/config"
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/sexp"
"github.com/google/cayley/nquads"
"github.com/google/cayley/quad/cquads"
"github.com/google/cayley/query"
"github.com/google/cayley/query/gremlin"
"github.com/google/cayley/query/mql"
"github.com/google/cayley/query/sexp"
)
func trace(s string) (string, time.Time) {
@ -41,7 +42,7 @@ func un(s string, startTime time.Time) {
fmt.Printf(s, float64(endTime.UnixNano()-startTime.UnixNano())/float64(1E6))
}
func Run(query string, ses graph.Session) {
func Run(query string, ses query.Session) {
nResults := 0
startTrace, startTime := trace("Elapsed time: %g ms\n\n")
defer func() {
@ -62,7 +63,7 @@ func Run(query string, ses graph.Session) {
}
func Repl(ts graph.TripleStore, queryLanguage string, cfg *config.Config) error {
var ses graph.Session
var ses query.Session
switch queryLanguage {
case "sexp":
ses = sexp.NewSession(ts)
@ -71,7 +72,7 @@ func Repl(ts graph.TripleStore, queryLanguage string, cfg *config.Config) error
case "gremlin":
fallthrough
default:
ses = gremlin.NewSession(ts, cfg.GremlinTimeout, true)
ses = gremlin.NewSession(ts, cfg.Timeout, true)
}
buf := bufio.NewReader(os.Stdin)
var line []byte
@ -99,6 +100,11 @@ func Repl(ts graph.TripleStore, queryLanguage string, cfg *config.Config) error
if len(line) == 0 {
continue
}
line = bytes.TrimSpace(line)
if len(line) == 0 || line[0] == '#' {
line = line[:0]
continue
}
if bytes.HasPrefix(line, []byte(":debug")) {
ses.ToggleDebug()
fmt.Println("Debug Toggled")
@ -107,7 +113,7 @@ func Repl(ts graph.TripleStore, queryLanguage string, cfg *config.Config) error
}
if bytes.HasPrefix(line, []byte(":a")) {
var tripleStmt = line[3:]
triple, err := nquads.Parse(string(tripleStmt))
triple, err := cquads.Parse(string(tripleStmt))
if triple == nil {
if err != nil {
fmt.Printf("not a valid triple: %v\n", err)
@ -121,7 +127,7 @@ func Repl(ts graph.TripleStore, queryLanguage string, cfg *config.Config) error
}
if bytes.HasPrefix(line, []byte(":d")) {
var tripleStmt = line[3:]
triple, err := nquads.Parse(string(tripleStmt))
triple, err := cquads.Parse(string(tripleStmt))
if triple == nil {
if err != nil {
fmt.Printf("not a valid triple: %v\n", err)
@ -135,13 +141,13 @@ func Repl(ts graph.TripleStore, queryLanguage string, cfg *config.Config) error
}
result, err := ses.InputParses(string(line))
switch result {
case graph.Parsed:
case query.Parsed:
Run(string(line), ses)
line = line[:0]
case graph.ParseFail:
case query.ParseFail:
fmt.Println("Error: ", err)
line = line[:0]
case graph.ParseMore:
case query.ParseMore:
}
}
}

View file

@ -72,12 +72,12 @@ All command line flags take precedence over the configuration file.
## Language Options
#### **`gremlin_timeout`**
#### **`timeout`**
* Type: Integer
* Type: Integer or String
* Default: 30
The value in seconds of the maximum length of time the Javascript runtime should run until cancelling the query and returning a 408 Timeout. A negative value means no limit.
The maximum length of time the Javascript runtime should run until cancelling the query and returning a 408 Timeout. When timeout is an integer it is interpreted as seconds; when it is a string it is [parsed](http://golang.org/pkg/time/#ParseDuration) as a Go time.Duration. A negative duration means no limit.
## Per-Database Options

View file

@ -93,7 +93,7 @@ POST Body: JSON triples
"subject": "Subject Node",
"predicate": "Predicate Node",
"object": "Object node",
"provenance": "Provenance node" // Optional
"label": "Label node" // Optional
}] // More than one triple allowed.
```
@ -121,7 +121,7 @@ POST Body: JSON triples
"subject": "Subject Node",
"predicate": "Predicate Node",
"object": "Object node",
"provenance": "Provenance node" // Optional
"label": "Label node" // Optional
}] // More than one triple allowed.
```
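
As a rough illustration of the body format above, the following sketch builds such a JSON array in Go. The `quadJSON` type and `encodeQuads` function are hypothetical names introduced here, not types from the Cayley codebase; only the four field names come from the documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// quadJSON is a hypothetical type mirroring the documented JSON fields.
// The label is optional, so it is omitted from the output when empty.
type quadJSON struct {
	Subject   string `json:"subject"`
	Predicate string `json:"predicate"`
	Object    string `json:"object"`
	Label     string `json:"label,omitempty"`
}

// encodeQuads serializes the quads as a JSON array suitable for a POST body.
func encodeQuads(quads []quadJSON) ([]byte, error) {
	return json.Marshal(quads)
}

func main() {
	body, err := encodeQuads([]quadJSON{
		{Subject: "alice", Predicate: "follows", Object: "bob"},
		{Subject: "bob", Predicate: "follows", Object: "carol", Label: "demo"},
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(body))
}
```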

View file

@ -28,13 +28,13 @@ You can repeat the `--db` and `--dbpath` flags from here forward instead of the
First we load the data.
```bash
./cayley load --config=cayley.cfg.overview --triples=30kmoviedata.nt.gz
./cayley load --config=cayley.cfg.overview --triples=30kmoviedata.nq.gz
```
And wait. It will load. If you'd like to watch it load, you can run
```bash
./cayley load --config=cayley.cfg.overview --triples=30kmoviedata.nt.gz --alsologtostderr
./cayley load --config=cayley.cfg.overview --triples=30kmoviedata.nq.gz --alsologtostderr
```
And watch the log output go by.

View file

@ -14,8 +14,7 @@
package graph
// Define the general iterator interface, as well as the Base iterator which all
// iterators can "inherit" from to get default iterator functionality.
// Define the general iterator interface.
import (
"strings"
@ -24,18 +23,46 @@ import (
"github.com/barakmich/glog"
)
type Tagger struct {
tags []string
fixedTags map[string]Value
}
// Adds a tag to the iterator.
func (t *Tagger) Add(tag string) {
t.tags = append(t.tags, tag)
}
func (t *Tagger) AddFixed(tag string, value Value) {
if t.fixedTags == nil {
t.fixedTags = make(map[string]Value)
}
t.fixedTags[tag] = value
}
// Returns the tags. The returned value must not be mutated.
func (t *Tagger) Tags() []string {
return t.tags
}
// Returns the fixed tags. The returned value must not be mutated.
func (t *Tagger) Fixed() map[string]Value {
return t.fixedTags
}
func (t *Tagger) CopyFrom(src Iterator) {
for _, tag := range src.Tagger().Tags() {
t.Add(tag)
}
for k, v := range src.Tagger().Fixed() {
t.AddFixed(k, v)
}
}
type Iterator interface {
// Tags are the way we handle results. By adding a tag to an iterator, we can
// "name" it, in a sense, and at each step of iteration, get a named result.
// TagResults() is therefore the handy way of walking an iterator tree and
// getting the named results.
//
// Tag Accessors.
AddTag(string)
Tags() []string
AddFixedTag(string, Value)
FixedTags() map[string]Value
CopyTagsFrom(Iterator)
Tagger() *Tagger
// Fills a tag-to-result-value map.
TagResults(map[string]Value)
@ -58,22 +85,12 @@ type Iterator interface {
// All of them should set iterator.Last to be the last returned value, to
// make results work.
//
// Next() advances the iterator and returns the next valid result. Returns
// (<value>, true) or (nil, false)
Next() (Value, bool)
// NextResult() advances iterators that may have more than one valid result,
// from the bottom up.
NextResult() bool
// Return whether this iterator is reliably nextable. Most iterators are.
// However, some iterators, like "not" are, by definition, the whole database
// except themselves. Next() on these is unproductive, if impossible.
CanNext() bool
// Check(), given a value, returns whether or not that value is within the set
// held by this iterator.
Check(Value) bool
// Contains returns whether the value is within the set held by the iterator.
Contains(Value) bool
// Start iteration from the beginning
Reset()
@ -114,7 +131,26 @@ type Iterator interface {
Close()
// UID returns the unique identifier of the iterator.
UID() uintptr
UID() uint64
}
type Nexter interface {
// Next() advances the iterator and returns the next valid result. Returns
// (<value>, true) or (nil, false)
Next() (Value, bool)
Iterator
}
// Next is a convenience function that conditionally calls the Next method
// of an Iterator if it is a Nexter. If the Iterator is not a Nexter, Next
// return a nil Value and false.
func Next(it Iterator) (Value, bool) {
if n, ok := it.(Nexter); ok {
return n.Next()
}
glog.Errorln("Nexting an un-nextable iterator")
return nil, false
}
// FixedIterator wraps iterators that are modifiable by addition of fixed value sets.
@ -124,9 +160,9 @@ type FixedIterator interface {
}
type IteratorStats struct {
CheckCost int64
NextCost int64
Size int64
ContainsCost int64
NextCost int64
Size int64
}
// Type enumerates the set of Iterator types.
@ -192,20 +228,20 @@ func (t Type) String() string {
return types[t]
}
// Utility logging functions for when an iterator gets called Next upon, or Check upon, as
// Utility logging functions for when an iterator gets called Next upon, or Contains upon, as
// well as what they return. Highly useful for tracing the execution path of a query.
func CheckLogIn(it Iterator, val Value) {
func ContainsLogIn(it Iterator, val Value) {
if glog.V(4) {
glog.V(4).Infof("%s %d CHECK %d", strings.ToUpper(it.Type().String()), it.UID(), val)
glog.V(4).Infof("%s %d CHECK CONTAINS %d", strings.ToUpper(it.Type().String()), it.UID(), val)
}
}
func CheckLogOut(it Iterator, val Value, good bool) bool {
func ContainsLogOut(it Iterator, val Value, good bool) bool {
if glog.V(4) {
if good {
glog.V(4).Infof("%s %d CHECK %d GOOD", strings.ToUpper(it.Type().String()), it.UID(), val)
glog.V(4).Infof("%s %d CHECK CONTAINS %d GOOD", strings.ToUpper(it.Type().String()), it.UID(), val)
} else {
glog.V(4).Infof("%s %d CHECK %d BAD", strings.ToUpper(it.Type().String()), it.UID(), val)
glog.V(4).Infof("%s %d CHECK CONTAINS %d BAD", strings.ToUpper(it.Type().String()), it.UID(), val)
}
}
return good

View file

@ -31,19 +31,25 @@ import (
// An All iterator across a range of int64 values, from `min` to `max`.
type Int64 struct {
Base
uid uint64
tags graph.Tagger
max, min int64
at int64
result graph.Value
}
// Creates a new Int64 with the given range.
func NewInt64(min, max int64) *Int64 {
var all Int64
BaseInit(&all.Base)
all.max = max
all.min = min
all.at = min
return &all
return &Int64{
uid: NextUID(),
min: min,
max: max,
at: min,
}
}
func (it *Int64) UID() uint64 {
return it.uid
}
// Start back at the beginning
@ -55,13 +61,28 @@ func (it *Int64) Close() {}
func (it *Int64) Clone() graph.Iterator {
out := NewInt64(it.min, it.max)
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
func (it *Int64) Tagger() *graph.Tagger {
return &it.tags
}
// Fill the map based on the tags assigned to this iterator.
func (it *Int64) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
// Prints the All iterator as just an "all".
func (it *Int64) DebugString(indent int) string {
return fmt.Sprintf("%s(%s tags: %v)", strings.Repeat(" ", indent), it.Type(), it.Tags())
return fmt.Sprintf("%s(%s tags: %v)", strings.Repeat(" ", indent), it.Type(), it.tags.Tags())
}
// Next() on an Int64 all iterator is a simple incrementing counter.
@ -76,10 +97,28 @@ func (it *Int64) Next() (graph.Value, bool) {
if it.at > it.max {
it.at = -1
}
it.Last = val
it.result = val
return graph.NextLogOut(it, val, true)
}
// DEPRECATED
func (it *Int64) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Int64) Result() graph.Value {
return it.result
}
func (it *Int64) NextResult() bool {
return false
}
// No sub-iterators.
func (it *Int64) SubIterators() []graph.Iterator {
return nil
}
// The number of elements in an Int64 is the size of the range.
// The size is exact.
func (it *Int64) Size() (int64, bool) {
@ -87,16 +126,16 @@ func (it *Int64) Size() (int64, bool) {
return Size, true
}
// Check() for an Int64 is merely seeing if the passed value is
// Contains() for an Int64 is merely seeing if the passed value is
// within the range, assuming the value is an int64.
func (it *Int64) Check(tsv graph.Value) bool {
graph.CheckLogIn(it, tsv)
func (it *Int64) Contains(tsv graph.Value) bool {
graph.ContainsLogIn(it, tsv)
v := tsv.(int64)
if it.min <= v && v <= it.max {
it.Last = v
return graph.CheckLogOut(it, v, true)
it.result = v
return graph.ContainsLogOut(it, v, true)
}
return graph.CheckLogOut(it, v, false)
return graph.ContainsLogOut(it, v, false)
}
// The type of this iterator is an "all". This is important, as it puts it in
@ -111,8 +150,8 @@ func (it *Int64) Optimize() (graph.Iterator, bool) { return it, false }
func (it *Int64) Stats() graph.IteratorStats {
s, _ := it.Size()
return graph.IteratorStats{
CheckCost: 1,
NextCost: 1,
Size: s,
ContainsCost: 1,
NextCost: 1,
Size: s,
}
}

View file

@ -6,11 +6,11 @@
//
// It accomplishes this in one of two ways. If it is a Next()ed iterator (that
// is, it is a top level iterator, or on the "Next() path", then it will Next()
// its primary iterator (helpfully, and.primary_it) and Check() the resultant
// its primary iterator (helpfully, and.primary_it) and Contains() the resultant
// value against its other iterators. If it matches all of them, then it
// returns that value. Otherwise, it repeats the process.
//
// If it's on a Check() path, it merely Check()s every iterator, and returns the
// If it's on a Contains() path, it merely calls Contains() on every iterator, and returns the
// logical AND of each result.
package iterator
@ -22,23 +22,28 @@ import (
"github.com/google/cayley/graph"
)
// The And iterator. Consists of a Base and a number of subiterators, the primary of which will
// The And iterator. Consists of a number of subiterators, the primary of which will
// be Next()ed if next is called.
type And struct {
Base
uid uint64
tags graph.Tagger
internalIterators []graph.Iterator
itCount int
primaryIt graph.Iterator
checkList []graph.Iterator
result graph.Value
}
// Creates a new And iterator.
func NewAnd() *And {
var and And
BaseInit(&and.Base)
and.internalIterators = make([]graph.Iterator, 0, 20)
and.checkList = nil
return &and
return &And{
uid: NextUID(),
internalIterators: make([]graph.Iterator, 0, 20),
}
}
func (it *And) UID() uint64 {
return it.uid
}
// Reset all internal iterators
@ -50,15 +55,38 @@ func (it *And) Reset() {
it.checkList = nil
}
func (it *And) Tagger() *graph.Tagger {
return &it.tags
}
// An extended TagResults, as it needs to add its own results and
// recurse down its subiterators.
func (it *And) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
if it.primaryIt != nil {
it.primaryIt.TagResults(dst)
}
for _, sub := range it.internalIterators {
sub.TagResults(dst)
}
}
func (it *And) Clone() graph.Iterator {
and := NewAnd()
and.AddSubIterator(it.primaryIt.Clone())
and.CopyTagsFrom(it)
and.tags.CopyFrom(it)
for _, sub := range it.internalIterators {
and.AddSubIterator(sub.Clone())
}
if it.checkList != nil {
and.optimizeCheck()
and.optimizeContains()
}
return and
}
@ -71,18 +99,6 @@ func (it *And) SubIterators() []graph.Iterator {
return iters
}
// Overrides Base TagResults, as it needs to add it's own results and
// recurse down it's subiterators.
func (it *And) TagResults(dst map[string]graph.Value) {
it.Base.TagResults(dst)
if it.primaryIt != nil {
it.primaryIt.TagResults(dst)
}
for _, sub := range it.internalIterators {
sub.TagResults(dst)
}
}
// DEPRECATED Returns the ResultTree for this iterator, recurses to its subiterators.
func (it *And) ResultTree() *graph.ResultTree {
tree := graph.NewResultTree(it.Result())
@ -101,7 +117,7 @@ func (it *And) DebugString(indent int) string {
total += fmt.Sprintf("%d:\n%s\n", i, sub.DebugString(indent+4))
}
var tags string
for _, k := range it.Tags() {
for _, k := range it.tags.Tags() {
tags += fmt.Sprintf("%s;", k)
}
spaces := strings.Repeat(" ", indent+2)
@ -144,23 +160,27 @@ func (it *And) Next() (graph.Value, bool) {
var curr graph.Value
var exists bool
for {
curr, exists = it.primaryIt.Next()
curr, exists = graph.Next(it.primaryIt)
if !exists {
return graph.NextLogOut(it, nil, false)
}
if it.checkSubIts(curr) {
it.Last = curr
if it.subItsContain(curr) {
it.result = curr
return graph.NextLogOut(it, curr, true)
}
}
panic("Somehow broke out of Next() loop in And")
panic("unreachable")
}
func (it *And) Result() graph.Value {
return it.result
}
// Checks a value against the non-primary iterators, in order.
func (it *And) checkSubIts(val graph.Value) bool {
func (it *And) subItsContain(val graph.Value) bool {
var subIsGood = true
for _, sub := range it.internalIterators {
subIsGood = sub.Check(val)
subIsGood = sub.Contains(val)
if !subIsGood {
break
}
@ -168,36 +188,36 @@ func (it *And) checkSubIts(val graph.Value) bool {
return subIsGood
}
func (it *And) checkCheckList(val graph.Value) bool {
func (it *And) checkContainsList(val graph.Value) bool {
ok := true
for _, c := range it.checkList {
ok = c.Check(val)
ok = c.Contains(val)
if !ok {
break
}
}
if ok {
it.Last = val
it.result = val
}
return graph.CheckLogOut(it, val, ok)
return graph.ContainsLogOut(it, val, ok)
}
// Check a value against the entire iterator, in order.
func (it *And) Check(val graph.Value) bool {
graph.CheckLogIn(it, val)
func (it *And) Contains(val graph.Value) bool {
graph.ContainsLogIn(it, val)
if it.checkList != nil {
return it.checkCheckList(val)
return it.checkContainsList(val)
}
mainGood := it.primaryIt.Check(val)
mainGood := it.primaryIt.Contains(val)
if !mainGood {
return graph.CheckLogOut(it, val, false)
return graph.ContainsLogOut(it, val, false)
}
othersGood := it.checkSubIts(val)
othersGood := it.subItsContain(val)
if !othersGood {
return graph.CheckLogOut(it, val, false)
return graph.ContainsLogOut(it, val, false)
}
it.Last = val
return graph.CheckLogOut(it, val, true)
it.result = val
return graph.ContainsLogOut(it, val, true)
}
// Returns the approximate size of the And iterator. Because we're dealing

View file

@ -38,10 +38,10 @@ import (
// In short, tread lightly.
// Optimizes the And, by picking the most efficient way to Next() and
// Check() its subiterators. For SQL fans, this is equivalent to JOIN.
// Contains() its subiterators. For SQL fans, this is equivalent to JOIN.
func (it *And) Optimize() (graph.Iterator, bool) {
// First, let's get the slice of iterators, in order (first one is Next()ed,
// the rest are Check()ed)
// the rest are Contains()ed)
old := it.SubIterators()
// And call Optimize() on our subtree, replacing each one in the order we
@ -82,9 +82,9 @@ func (it *And) Optimize() (graph.Iterator, bool) {
}
// Move the tags hanging on us (like any good replacement).
newAnd.CopyTagsFrom(it)
newAnd.tags.CopyFrom(it)
newAnd.optimizeCheck()
newAnd.optimizeContains()
// And close ourselves but not our subiterators -- some may still be alive in
// the new And (they were unchanged upon calling Optimize() on them, at the
@ -142,24 +142,24 @@ func optimizeOrder(its []graph.Iterator) []graph.Iterator {
// Find the iterator with the projected "best" total cost.
// Total cost is defined as The Next()ed iterator's cost to Next() out
// all of its contents, and to Check() each of those against everyone
// all of its contents, and to Contains() each of those against everyone
// else.
for _, it := range its {
if !it.CanNext() {
if _, canNext := it.(graph.Nexter); !canNext {
bad = append(bad, it)
continue
}
rootStats := it.Stats()
cost := rootStats.NextCost
for _, f := range its {
if !f.CanNext() {
if _, canNext := f.(graph.Nexter); !canNext {
continue
}
if f == it {
continue
}
stats := f.Stats()
cost += stats.CheckCost
cost += stats.ContainsCost
}
cost *= rootStats.Size
if cost < bestCost {
@ -169,7 +169,7 @@ func optimizeOrder(its []graph.Iterator) []graph.Iterator {
}
// TODO(barakmich): Optimization of order need not stop here. Picking a smart
// Check() order based on probability of getting a false Check() first is
// Contains() order based on probability of getting a false Contains() first is
// useful (fail faster).
// Put the best iterator (the one we wish to Next()) at the front...
@ -177,7 +177,7 @@ func optimizeOrder(its []graph.Iterator) []graph.Iterator {
// ... push everyone else after...
for _, it := range its {
if !it.CanNext() {
if _, canNext := it.(graph.Nexter); !canNext {
continue
}
if it != best {
@ -192,12 +192,12 @@ func optimizeOrder(its []graph.Iterator) []graph.Iterator {
type byCost []graph.Iterator
func (c byCost) Len() int { return len(c) }
func (c byCost) Less(i, j int) bool { return c[i].Stats().CheckCost < c[j].Stats().CheckCost }
func (c byCost) Less(i, j int) bool { return c[i].Stats().ContainsCost < c[j].Stats().ContainsCost }
func (c byCost) Swap(i, j int) { c[i], c[j] = c[j], c[i] }
// optimizeCheck(l) creates an alternate check list, containing the same contents
// optimizeContains() creates an alternate check list, containing the same contents
// but with a new ordering, however it wishes.
func (it *And) optimizeCheck() {
func (it *And) optimizeContains() {
// GetSubIterators allocates, so this is currently safe.
// TODO(kortschak) Reuse it.checkList if possible.
// This involves providing GetSubIterators with a slice to fill.
@ -213,11 +213,11 @@ func (it *And) optimizeCheck() {
func (it *And) getSubTags() map[string]struct{} {
tags := make(map[string]struct{})
for _, sub := range it.SubIterators() {
for _, tag := range sub.Tags() {
for _, tag := range sub.Tagger().Tags() {
tags[tag] = struct{}{}
}
}
for _, tag := range it.Tags() {
for _, tag := range it.tags.Tags() {
tags[tag] = struct{}{}
}
return tags
@ -227,13 +227,14 @@ func (it *And) getSubTags() map[string]struct{} {
// src itself, and moves them to dst.
func moveTagsTo(dst graph.Iterator, src *And) {
tags := src.getSubTags()
for _, tag := range dst.Tags() {
for _, tag := range dst.Tagger().Tags() {
if _, ok := tags[tag]; ok {
delete(tags, tag)
}
}
dt := dst.Tagger()
for k := range tags {
dst.AddTag(k)
dt.Add(k)
}
}
@ -297,21 +298,21 @@ func hasOneUsefulIterator(its []graph.Iterator) graph.Iterator {
// For now, however, it's pretty static.
func (it *And) Stats() graph.IteratorStats {
primaryStats := it.primaryIt.Stats()
CheckCost := primaryStats.CheckCost
ContainsCost := primaryStats.ContainsCost
NextCost := primaryStats.NextCost
Size := primaryStats.Size
for _, sub := range it.internalIterators {
stats := sub.Stats()
NextCost += stats.CheckCost
CheckCost += stats.CheckCost
NextCost += stats.ContainsCost
ContainsCost += stats.ContainsCost
if Size > stats.Size {
Size = stats.Size
}
}
return graph.IteratorStats{
CheckCost: CheckCost,
NextCost: NextCost,
Size: Size,
ContainsCost: ContainsCost,
NextCost: NextCost,
Size: Size,
}
}

View file

@ -32,9 +32,9 @@ func TestIteratorPromotion(t *testing.T) {
a := NewAnd()
a.AddSubIterator(all)
a.AddSubIterator(fixed)
all.AddTag("a")
fixed.AddTag("b")
a.AddTag("c")
all.Tagger().Add("a")
fixed.Tagger().Add("b")
a.Tagger().Add("c")
newIt, changed := a.Optimize()
if !changed {
t.Error("Iterator didn't optimize")
@ -43,7 +43,7 @@ func TestIteratorPromotion(t *testing.T) {
t.Error("Expected fixed iterator")
}
tagsExpected := []string{"a", "b", "c"}
tags := newIt.Tags()
tags := newIt.Tagger().Tags()
sort.Strings(tags)
if !reflect.DeepEqual(tags, tagsExpected) {
t.Fatal("Tags don't match")
@ -67,9 +67,9 @@ func TestNullIteratorAnd(t *testing.T) {
func TestReorderWithTag(t *testing.T) {
all := NewInt64(100, 300)
all.AddTag("good")
all.Tagger().Add("good")
all2 := NewInt64(1, 30000)
all2.AddTag("slow")
all2.Tagger().Add("slow")
a := NewAnd()
// Make all2 the default iterator
a.AddSubIterator(all2)
@ -82,7 +82,7 @@ func TestReorderWithTag(t *testing.T) {
expectedTags := []string{"good", "slow"}
tagsOut := make([]string, 0)
for _, sub := range newIt.SubIterators() {
for _, x := range sub.Tags() {
for _, x := range sub.Tagger().Tags() {
tagsOut = append(tagsOut, x)
}
}
@ -93,9 +93,9 @@ func TestReorderWithTag(t *testing.T) {
func TestAndStatistics(t *testing.T) {
all := NewInt64(100, 300)
all.AddTag("good")
all.Tagger().Add("good")
all2 := NewInt64(1, 30000)
all2.AddTag("slow")
all2.Tagger().Add("slow")
a := NewAnd()
// Make all2 the default iterator
a.AddSubIterator(all2)

View file

@ -24,11 +24,11 @@ import (
func TestTag(t *testing.T) {
fix1 := newFixed()
fix1.Add(234)
fix1.AddTag("foo")
fix1.Tagger().Add("foo")
and := NewAnd()
and.AddSubIterator(fix1)
and.AddTag("bar")
out := fix1.Tags()
and.Tagger().Add("bar")
out := fix1.Tagger().Tags()
if len(out) != 1 {
t.Errorf("Expected length 1, got %d", len(out))
}

View file

@ -30,10 +30,12 @@ import (
// A Fixed iterator consists of its values, an index (where it is in the process of Next()ing) and
// an equality function.
type Fixed struct {
Base
uid uint64
tags graph.Tagger
values []graph.Value
lastIndex int
cmp Equality
result graph.Value
}
// Define the signature of an equality function.
@ -54,12 +56,15 @@ func newFixed() *Fixed {
// Creates a new Fixed iterator with a custom comparator.
func NewFixedIteratorWithCompare(compareFn Equality) *Fixed {
var it Fixed
BaseInit(&it.Base)
it.values = make([]graph.Value, 0, 20)
it.lastIndex = 0
it.cmp = compareFn
return &it
return &Fixed{
uid: NextUID(),
values: make([]graph.Value, 0, 20),
cmp: compareFn,
}
}
func (it *Fixed) UID() uint64 {
return it.uid
}
func (it *Fixed) Reset() {
@ -68,12 +73,26 @@ func (it *Fixed) Reset() {
func (it *Fixed) Close() {}
func (it *Fixed) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Fixed) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *Fixed) Clone() graph.Iterator {
out := NewFixedIteratorWithCompare(it.cmp)
for _, val := range it.values {
out.Add(val)
}
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
@ -92,7 +111,7 @@ func (it *Fixed) DebugString(indent int) string {
return fmt.Sprintf("%s(%s tags: %s Size: %d id0: %d)",
strings.Repeat(" ", indent),
it.Type(),
it.FixedTags(),
it.tags.Fixed(),
len(it.values),
value,
)
@ -102,18 +121,18 @@ func (it *Fixed) DebugString(indent int) string {
func (it *Fixed) Type() graph.Type { return graph.Fixed }
// Check if the passed value is equal to one of the values stored in the iterator.
func (it *Fixed) Check(v graph.Value) bool {
func (it *Fixed) Contains(v graph.Value) bool {
// Could be optimized by keeping it sorted or using a better datastructure.
// However, for fixed iterators, which are by definition kind of tiny, this
// isn't a big issue.
graph.CheckLogIn(it, v)
graph.ContainsLogIn(it, v)
for _, x := range it.values {
if it.cmp(x, v) {
it.Last = x
return graph.CheckLogOut(it, v, true)
it.result = x
return graph.ContainsLogOut(it, v, true)
}
}
return graph.CheckLogOut(it, v, false)
return graph.ContainsLogOut(it, v, false)
}
// Return the next stored value from the iterator.
@ -123,11 +142,29 @@ func (it *Fixed) Next() (graph.Value, bool) {
return graph.NextLogOut(it, nil, false)
}
out := it.values[it.lastIndex]
it.Last = out
it.result = out
it.lastIndex++
return graph.NextLogOut(it, out, true)
}
// DEPRECATED
func (it *Fixed) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Fixed) Result() graph.Value {
return it.result
}
func (it *Fixed) NextResult() bool {
return false
}
// No sub-iterators.
func (it *Fixed) SubIterators() []graph.Iterator {
return nil
}
// Optimize() for a Fixed iterator is simple. Returns a Null iterator if it's empty
// (so that other iterators upstream can treat this as null) or there is no
// optimization.
@ -144,12 +181,12 @@ func (it *Fixed) Size() (int64, bool) {
return int64(len(it.values)), true
}
// As we right now have to scan the entire list, Next and Check are linear with the
// As we right now have to scan the entire list, Next and Contains are linear with the
// size. However, a better data structure could remove these limits.
func (it *Fixed) Stats() graph.IteratorStats {
return graph.IteratorStats{
CheckCost: int64(len(it.values)),
NextCost: int64(len(it.values)),
Size: int64(len(it.values)),
ContainsCost: int64(len(it.values)),
NextCost: int64(len(it.values)),
Size: int64(len(it.values)),
}
}
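
The hunks above replace the embedded `Base` with an explicit `tags` field and a `Tagger()` accessor. The following is a minimal, self-contained sketch of that pattern, not the actual `graph.Tagger` implementation (which this diff does not show). `Value`, the `fixed` type, the `tagged` helper, and the `AddFixed` method are assumptions made for illustration; only `Add`, `Tags`, and `Fixed` appear in the diff.

```go
package main

import (
	"fmt"
	"sort"
)

// Value stands in for graph.Value in this sketch.
type Value interface{}

// Tagger is a hypothetical stand-in for graph.Tagger.
type Tagger struct {
	tags      []string
	fixedTags map[string]Value
}

// Add attaches a tag that will be bound to the iterator's current result.
func (t *Tagger) Add(tag string) { t.tags = append(t.tags, tag) }

// AddFixed attaches a tag with a constant value (assumed method name).
func (t *Tagger) AddFixed(tag string, v Value) {
	if t.fixedTags == nil {
		t.fixedTags = make(map[string]Value)
	}
	t.fixedTags[tag] = v
}

func (t *Tagger) Tags() []string          { return t.tags }
func (t *Tagger) Fixed() map[string]Value { return t.fixedTags }

// fixed is a cut-down iterator that owns its Tagger and its last result.
type fixed struct {
	tags   Tagger
	result Value
}

func (it *fixed) Tagger() *Tagger { return &it.tags }
func (it *fixed) Result() Value   { return it.result }

// TagResults mirrors the loop that each iterator now spells out itself.
func (it *fixed) TagResults(dst map[string]Value) {
	for _, tag := range it.tags.Tags() {
		dst[tag] = it.Result()
	}
	for tag, value := range it.tags.Fixed() {
		dst[tag] = value
	}
}

// tagged builds an iterator with one plain and one fixed tag and collects them.
func tagged() map[string]Value {
	it := &fixed{result: 234}
	it.Tagger().Add("foo")
	it.Tagger().AddFixed("bar", "const")
	dst := make(map[string]Value)
	it.TagResults(dst)
	return dst
}

func main() {
	dst := tagged()
	keys := make([]string, 0, len(dst))
	for k := range dst {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%v\n", k, dst[k])
	}
}
```

The visible trade-off is that every iterator repeats the two `TagResults` loops instead of inheriting them, in exchange for each type stating its own fields and methods explicitly.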

View file

@ -23,10 +23,10 @@ package iterator
// path. That's okay -- in reality, it can be viewed as returning the value for
// a new triple, but to make logic much simpler, here we have the HasA.
//
// Likewise, it's important to think about Check()ing a HasA. When given a
// Likewise, it's important to think about Contains()ing a HasA. When given a
// value to check, it means "Check all predicates that have this value for your
// direction against the subiterator." This would imply that there's more than
// one possibility for the same Check()ed value. While we could return the
// one possibility for the same Contains()ed value. While we could return the
// number of options, it's simpler to return one, and then call NextResult()
// enough times to enumerate the options. (In fact, one could argue that the
// raison d'etre for NextResult() is this iterator).
@ -40,28 +40,35 @@ import (
"github.com/barakmich/glog"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
// A HasA consists of a reference back to the graph.TripleStore that it references,
// a primary subiterator, a direction in which the triples for that subiterator point,
// and a temporary holder for the iterator generated on Check().
// and a temporary holder for the iterator generated on Contains().
type HasA struct {
Base
uid uint64
tags graph.Tagger
ts graph.TripleStore
primaryIt graph.Iterator
dir graph.Direction
dir quad.Direction
resultIt graph.Iterator
result graph.Value
}
// Construct a new HasA iterator, given the triple subiterator, and the triple
// direction for which it stands.
func NewHasA(ts graph.TripleStore, subIt graph.Iterator, d graph.Direction) *HasA {
var hasa HasA
BaseInit(&hasa.Base)
hasa.ts = ts
hasa.primaryIt = subIt
hasa.dir = d
return &hasa
func NewHasA(ts graph.TripleStore, subIt graph.Iterator, d quad.Direction) *HasA {
return &HasA{
uid: NextUID(),
ts: ts,
primaryIt: subIt,
dir: d,
}
}
func (it *HasA) UID() uint64 {
return it.uid
}
// Return our sole subiterator.
@ -76,14 +83,18 @@ func (it *HasA) Reset() {
}
}
func (it *HasA) Tagger() *graph.Tagger {
return &it.tags
}
func (it *HasA) Clone() graph.Iterator {
out := NewHasA(it.ts, it.primaryIt.Clone(), it.dir)
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
// Direction accessor.
func (it *HasA) Direction() graph.Direction { return it.dir }
func (it *HasA) Direction() quad.Direction { return it.dir }
// Pass the Optimize() call along to the subiterator. If it becomes Null,
// then the HasA becomes Null (there are no triples that have any directions).
@ -100,7 +111,14 @@ func (it *HasA) Optimize() (graph.Iterator, bool) {
// Pass the TagResults down the chain.
func (it *HasA) TagResults(dst map[string]graph.Value) {
it.Base.TagResults(dst)
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
it.primaryIt.TagResults(dst)
}
@ -114,7 +132,7 @@ func (it *HasA) ResultTree() *graph.ResultTree {
// Print some information about this iterator.
func (it *HasA) DebugString(indent int) string {
var tags string
for _, k := range it.Tags() {
for _, k := range it.tags.Tags() {
tags += fmt.Sprintf("%s;", k)
}
return fmt.Sprintf("%s(%s %d tags:%s direction:%s\n%s)", strings.Repeat(" ", indent), it.Type(), it.UID(), tags, it.dir, it.primaryIt.DebugString(indent+4))
@ -122,9 +140,9 @@ func (it *HasA) DebugString(indent int) string {
// Check a value against our internal iterator. In order to do this, we must first open a new
// iterator of "triples that have `val` in our direction", given to us by the triple store,
// and then Next() values out of that iterator and Check() them against our subiterator.
func (it *HasA) Check(val graph.Value) bool {
graph.CheckLogIn(it, val)
// and then Next() values out of that iterator and Contains() them against our subiterator.
func (it *HasA) Contains(val graph.Value) bool {
graph.ContainsLogIn(it, val)
if glog.V(4) {
glog.V(4).Infoln("Id is", it.ts.NameOf(val))
}
@ -133,23 +151,23 @@ func (it *HasA) Check(val graph.Value) bool {
it.resultIt.Close()
}
it.resultIt = it.ts.TripleIterator(it.dir, val)
return graph.CheckLogOut(it, val, it.GetCheckResult())
return graph.ContainsLogOut(it, val, it.NextContains())
}
// GetCheckResult() is shared code between Check() and GetNextResult() -- calls next on the
// NextContains() is shared code between Contains() and GetNextResult() -- calls next on the
// result iterator (a triple iterator based on the last checked value) and returns true if
// another match is made.
func (it *HasA) GetCheckResult() bool {
func (it *HasA) NextContains() bool {
for {
linkVal, ok := it.resultIt.Next()
linkVal, ok := graph.Next(it.resultIt)
if !ok {
break
}
if glog.V(4) {
glog.V(4).Infoln("Triple is", it.ts.Triple(linkVal))
glog.V(4).Infoln("Quad is", it.ts.Quad(linkVal))
}
if it.primaryIt.Check(linkVal) {
it.Last = it.ts.TripleDirection(linkVal, it.dir)
if it.primaryIt.Contains(linkVal) {
it.result = it.ts.TripleDirection(linkVal, it.dir)
return true
}
}
@ -160,17 +178,17 @@ func (it *HasA) GetCheckResult() bool {
func (it *HasA) NextResult() bool {
// Order here is important. If the subiterator has a NextResult, then we
// need do nothing -- there is a next result, and we shouldn't move forward.
// However, we then need to get the next result from our last Check().
// However, we then need to get the next result from our last Contains().
//
// The upshot is, the end of NextResult() bubbles up from the bottom of the
// iterator tree up, and we need to respect that.
if it.primaryIt.NextResult() {
return true
}
return it.GetCheckResult()
return it.NextContains()
}
// Get the next result from this iterator. This is simpler than Check. We have a
// Get the next result from this iterator. This is simpler than Contains. We have a
// subiterator we can get a value from, and we can take that resultant triple,
// pull our direction out of it, and return that.
func (it *HasA) Next() (graph.Value, bool) {
@ -180,19 +198,23 @@ func (it *HasA) Next() (graph.Value, bool) {
}
it.resultIt = &Null{}
tID, ok := it.primaryIt.Next()
tID, ok := graph.Next(it.primaryIt)
if !ok {
return graph.NextLogOut(it, 0, false)
}
name := it.ts.Triple(tID).Get(it.dir)
name := it.ts.Quad(tID).Get(it.dir)
val := it.ts.ValueOf(name)
it.Last = val
it.result = val
return graph.NextLogOut(it, val, true)
}
func (it *HasA) Result() graph.Value {
return it.result
}
// GetStats() returns the statistics on the HasA iterator. This is curious. Next
// cost is easy, it's an extra call or so on top of the subiterator Next cost.
// CheckCost involves going to the graph.TripleStore, iterating out values, and hoping
// ContainsCost involves going to the graph.TripleStore, iterating out values, and hoping
// one sticks -- potentially expensive, depending on fanout. Size, however, is
// potentially smaller. we know at worst it's the size of the subiterator, but
// if there are many repeated values, it could be much smaller in totality.
@ -205,9 +227,9 @@ func (it *HasA) Stats() graph.IteratorStats {
nextConstant := int64(2)
tripleConstant := int64(1)
return graph.IteratorStats{
NextCost: tripleConstant + subitStats.NextCost,
CheckCost: (fanoutFactor * nextConstant) * subitStats.CheckCost,
Size: faninFactor * subitStats.Size,
NextCost: tripleConstant + subitStats.NextCost,
ContainsCost: (fanoutFactor * nextConstant) * subitStats.ContainsCost,
Size: faninFactor * subitStats.Size,
}
}
@ -221,3 +243,7 @@ func (it *HasA) Close() {
// Register this iterator as a HasA.
func (it *HasA) Type() graph.Type { return graph.HasA }
func (it *HasA) Size() (int64, bool) {
return 0, true
}
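
The constructors above now set `uid: NextUID()` unconditionally, where `BaseInit` previously assigned an ID only under `glog.V(2)`. As a rough sketch of why an atomic counter is enough here, the following hands out IDs from many goroutines and counts the distinct values. The `uniqueUIDs` helper is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

var nextIteratorID uint64

// NextUID returns a process-wide unique ID, starting at 0.
func NextUID() uint64 {
	return atomic.AddUint64(&nextIteratorID, 1) - 1
}

// uniqueUIDs requests n IDs concurrently and reports how many were distinct.
func uniqueUIDs(n int) int {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		seen = make(map[uint64]bool)
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			id := NextUID()
			mu.Lock()
			seen[id] = true
			mu.Unlock()
		}()
	}
	wg.Wait()
	return len(seen)
}

func main() {
	// Every goroutine should observe a different ID.
	fmt.Println(uniqueUIDs(100))
}
```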

View file

@ -14,161 +14,55 @@
package iterator
// Define the general iterator interface, as well as the Base which all
// iterators can "inherit" from to get default iterator functionality.
// Define the general iterator interface.
import (
"fmt"
"strings"
"sync/atomic"
"github.com/barakmich/glog"
"github.com/google/cayley/graph"
)
var nextIteratorID uintptr
var nextIteratorID uint64
func nextID() uintptr {
return atomic.AddUintptr(&nextIteratorID, 1) - 1
func NextUID() uint64 {
return atomic.AddUint64(&nextIteratorID, 1) - 1
}
// The Base iterator is the iterator other iterators inherit from to get some
// default functionality.
type Base struct {
Last graph.Value
tags []string
fixedTags map[string]graph.Value
canNext bool
uid uintptr
}
// Called by subclases.
func BaseInit(it *Base) {
// Your basic iterator is nextable
it.canNext = true
if glog.V(2) {
it.uid = nextID()
}
}
func (it *Base) UID() uintptr {
return it.uid
}
// Adds a tag to the iterator. Most iterators don't need to override.
func (it *Base) AddTag(tag string) {
if it.tags == nil {
it.tags = make([]string, 0)
}
it.tags = append(it.tags, tag)
}
func (it *Base) AddFixedTag(tag string, value graph.Value) {
if it.fixedTags == nil {
it.fixedTags = make(map[string]graph.Value)
}
it.fixedTags[tag] = value
}
// Returns the tags.
func (it *Base) Tags() []string {
return it.tags
}
func (it *Base) FixedTags() map[string]graph.Value {
return it.fixedTags
}
func (it *Base) CopyTagsFrom(other_it graph.Iterator) {
for _, tag := range other_it.Tags() {
it.AddTag(tag)
}
for k, v := range other_it.FixedTags() {
it.AddFixedTag(k, v)
}
}
// Prints a silly debug string. Most classes override.
func (it *Base) DebugString(indent int) string {
return fmt.Sprintf("%s(base)", strings.Repeat(" ", indent))
}
// Nothing in a base iterator.
func (it *Base) Check(v graph.Value) bool {
return false
}
// Base iterators should never appear in a tree if they are, select against
// them.
func (it *Base) Stats() graph.IteratorStats {
return graph.IteratorStats{100000, 100000, 100000}
}
// DEPRECATED
func (it *Base) ResultTree() *graph.ResultTree {
tree := graph.NewResultTree(it.Result())
return tree
}
// Nothing in a base iterator.
func (it *Base) Next() (graph.Value, bool) {
return nil, false
}
func (it *Base) NextResult() bool {
return false
}
// Returns the last result of an iterator.
func (it *Base) Result() graph.Value {
return it.Last
}
// If you're empty and you know it, clap your hands.
func (it *Base) Size() (int64, bool) {
return 0, true
}
// No subiterators. Only those with subiterators need to do anything here.
func (it *Base) SubIterators() []graph.Iterator {
return nil
}
// Accessor
func (it *Base) CanNext() bool { return it.canNext }
// Fill the map based on the tags assigned to this iterator. Default
// functionality works well for most iterators.
func (it *Base) TagResults(dst map[string]graph.Value) {
for _, tag := range it.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.FixedTags() {
dst[tag] = value
}
}
// Nothing to clean up.
// func (it *Base) Close() {}
func (it *Null) Close() {}
func (it *Base) Reset() {}
// Here we define the simplest base iterator -- the Null iterator. It contains nothing.
// Here we define the simplest iterator -- the Null iterator. It contains nothing.
// It is the empty set. Often times, queries that contain one of these match nothing,
// so it's important to give it a special iterator.
type Null struct {
Base
uid uint64
tags graph.Tagger
}
// Fairly useless New function.
func NewNull() *Null {
return &Null{}
return &Null{uid: NextUID()}
}
func (it *Null) UID() uint64 {
return it.uid
}
func (it *Null) Tagger() *graph.Tagger {
return &it.tags
}
// Fill the map based on the tags assigned to this iterator.
func (it *Null) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *Null) Contains(graph.Value) bool {
return false
}
func (it *Null) Clone() graph.Iterator { return NewNull() }
@ -185,6 +79,34 @@ func (it *Null) DebugString(indent int) string {
return strings.Repeat(" ", indent) + "(null)"
}
func (it *Null) Next() (graph.Value, bool) {
return nil, false
}
func (it *Null) Result() graph.Value {
return nil
}
func (it *Null) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Null) SubIterators() []graph.Iterator {
return nil
}
func (it *Null) NextResult() bool {
return false
}
func (it *Null) Size() (int64, bool) {
return 0, true
}
func (it *Null) Reset() {}
func (it *Null) Close() {}
// A null iterator costs nothing. Use it!
func (it *Null) Stats() graph.IteratorStats {
return graph.IteratorStats{}

View file

@ -23,7 +23,7 @@ package iterator
// LinksTo is therefore sensitive to growing with a fanout. (A small-sized
// subiterator could cause LinksTo to be large).
//
// Check()ing a LinksTo means, given a link, take the direction we care about
// Contains()ing a LinksTo means, given a link, take the direction we care about
// and check if it's in our subiterator. Checking is therefore fairly cheap, and
// similar to checking the subiterator alone.
//
@ -34,29 +34,36 @@ import (
"strings"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
// A LinksTo has a reference back to the graph.TripleStore (to create the iterators
// for each node) the subiterator, and the direction the iterator comes from.
// `nextIt` is the temporary iterator held per result in `primaryIt`.
type LinksTo struct {
Base
uid uint64
tags graph.Tagger
ts graph.TripleStore
primaryIt graph.Iterator
dir graph.Direction
dir quad.Direction
nextIt graph.Iterator
result graph.Value
}
// Construct a new LinksTo iterator around a direction and a subiterator of
// nodes.
func NewLinksTo(ts graph.TripleStore, it graph.Iterator, d graph.Direction) *LinksTo {
var lto LinksTo
BaseInit(&lto.Base)
lto.ts = ts
lto.primaryIt = it
lto.dir = d
lto.nextIt = &Null{}
return &lto
func NewLinksTo(ts graph.TripleStore, it graph.Iterator, d quad.Direction) *LinksTo {
return &LinksTo{
uid: NextUID(),
ts: ts,
primaryIt: it,
dir: d,
nextIt: &Null{},
}
}
func (it *LinksTo) UID() uint64 {
return it.uid
}
func (it *LinksTo) Reset() {
@ -67,18 +74,29 @@ func (it *LinksTo) Reset() {
it.nextIt = &Null{}
}
func (it *LinksTo) Tagger() *graph.Tagger {
return &it.tags
}
func (it *LinksTo) Clone() graph.Iterator {
out := NewLinksTo(it.ts, it.primaryIt.Clone(), it.dir)
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
// Return the direction under consideration.
func (it *LinksTo) Direction() graph.Direction { return it.dir }
func (it *LinksTo) Direction() quad.Direction { return it.dir }
// Tag these results, and our subiterator's results.
func (it *LinksTo) TagResults(dst map[string]graph.Value) {
it.Base.TagResults(dst)
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
it.primaryIt.TagResults(dst)
}
@ -98,14 +116,14 @@ func (it *LinksTo) DebugString(indent int) string {
// If it checks in the right direction for the subiterator, it is a valid link
// for the LinksTo.
func (it *LinksTo) Check(val graph.Value) bool {
graph.CheckLogIn(it, val)
func (it *LinksTo) Contains(val graph.Value) bool {
graph.ContainsLogIn(it, val)
node := it.ts.TripleDirection(val, it.dir)
if it.primaryIt.Check(node) {
it.Last = val
return graph.CheckLogOut(it, val, true)
if it.primaryIt.Contains(node) {
it.result = val
return graph.ContainsLogOut(it, val, true)
}
return graph.CheckLogOut(it, val, false)
return graph.ContainsLogOut(it, val, false)
}
// Return a list containing only our subiterator.
@ -137,10 +155,10 @@ func (it *LinksTo) Optimize() (graph.Iterator, bool) {
// Next()ing a LinksTo operates as described above.
func (it *LinksTo) Next() (graph.Value, bool) {
graph.NextLogIn(it)
val, ok := it.nextIt.Next()
val, ok := graph.Next(it.nextIt)
if !ok {
// Subiterator is empty, get another one
candidate, ok := it.primaryIt.Next()
candidate, ok := graph.Next(it.primaryIt)
if !ok {
// We're out of nodes in our subiterator, so we're done as well.
return graph.NextLogOut(it, 0, false)
@ -150,10 +168,14 @@ func (it *LinksTo) Next() (graph.Value, bool) {
// Recurse -- return the first in the next set.
return it.Next()
}
it.Last = val
it.result = val
return graph.NextLogOut(it, val, ok)
}
func (it *LinksTo) Result() graph.Value {
return it.result
}
// Close our subiterators.
func (it *LinksTo) Close() {
it.nextIt.Close()
@ -176,8 +198,12 @@ func (it *LinksTo) Stats() graph.IteratorStats {
checkConstant := int64(1)
nextConstant := int64(2)
return graph.IteratorStats{
NextCost: nextConstant + subitStats.NextCost,
CheckCost: checkConstant + subitStats.CheckCost,
Size: fanoutFactor * subitStats.Size,
NextCost: nextConstant + subitStats.NextCost,
ContainsCost: checkConstant + subitStats.ContainsCost,
Size: fanoutFactor * subitStats.Size,
}
}
func (it *LinksTo) Size() (int64, bool) {
return 0, true
}
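
As the comments above describe, `Contains()` on a LinksTo takes a link, pulls out the node in the direction of interest, and asks the subiterator whether it holds that node. The following sketch shows that logic in isolation. The `Quad`, `Direction`, and `linksTo` types are simplified assumptions for illustration, and a plain map stands in for the primary subiterator.

```go
package main

import "fmt"

// Direction is a simplified stand-in for quad.Direction.
type Direction int

const (
	Subject Direction = iota
	Predicate
	Object
)

// Quad is a simplified stand-in for quad.Quad (label omitted).
type Quad struct{ Subject, Predicate, Object string }

// Get returns the node found in direction d.
func (q Quad) Get(d Direction) string {
	switch d {
	case Subject:
		return q.Subject
	case Predicate:
		return q.Predicate
	case Object:
		return q.Object
	}
	return ""
}

// linksTo accepts every link whose node in dir is in nodes.
type linksTo struct {
	nodes  map[string]bool // stands in for the primary subiterator
	dir    Direction
	result Quad
}

// Contains is about as cheap as checking the subiterator alone.
func (it *linksTo) Contains(q Quad) bool {
	if it.nodes[q.Get(it.dir)] {
		it.result = q
		return true
	}
	return false
}

func main() {
	it := &linksTo{nodes: map[string]bool{"cool": true}, dir: Object}
	fmt.Println(it.Contains(Quad{"alice", "status", "cool"}))
	fmt.Println(it.Contains(Quad{"bob", "status", "dull"}))
}
```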

View file

@ -17,7 +17,7 @@ package iterator
import (
"testing"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
func TestLinksTo(t *testing.T) {
@ -32,12 +32,12 @@ func TestLinksTo(t *testing.T) {
t.Fatalf("Failed to return correct value, got:%v expect:1", val)
}
fixed.Add(val)
lto := NewLinksTo(ts, fixed, graph.Object)
lto := NewLinksTo(ts, fixed, quad.Object)
val, ok := lto.Next()
if !ok {
t.Error("At least one triple matches the fixed object")
}
if val != 2 {
t.Errorf("Triple index 2, such as %s, should match %s", ts.Triple(2), ts.Triple(val))
t.Errorf("Quad index 2, such as %s, should match %s", ts.Quad(2), ts.Quad(val))
}
}

View file

@ -17,15 +17,18 @@ package iterator
// A quickly mocked version of the TripleStore interface, for use in tests.
// Can better used Mock.Called but will fill in as needed.
import "github.com/google/cayley/graph"
import (
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
type store struct {
data []string
iter graph.Iterator
}
func (ts *store) ValueOf(s string) graph.Value {
for i, v := range ts.data {
func (qs *store) ValueOf(s string) graph.Value {
for i, v := range qs.data {
if s == v {
return i
}
@ -33,42 +36,42 @@ func (ts *store) ValueOf(s string) graph.Value {
return nil
}
func (ts *store) AddTriple(*graph.Triple) {}
func (qs *store) AddTriple(*quad.Quad) {}
func (ts *store) AddTripleSet([]*graph.Triple) {}
func (qs *store) AddTripleSet([]*quad.Quad) {}
func (ts *store) Triple(graph.Value) *graph.Triple { return &graph.Triple{} }
func (qs *store) Quad(graph.Value) *quad.Quad { return &quad.Quad{} }
func (ts *store) TripleIterator(d graph.Direction, i graph.Value) graph.Iterator {
return ts.iter
func (qs *store) TripleIterator(d quad.Direction, i graph.Value) graph.Iterator {
return qs.iter
}
func (ts *store) NodesAllIterator() graph.Iterator { return &Null{} }
func (qs *store) NodesAllIterator() graph.Iterator { return &Null{} }
func (ts *store) TriplesAllIterator() graph.Iterator { return &Null{} }
func (qs *store) TriplesAllIterator() graph.Iterator { return &Null{} }
func (ts *store) NameOf(v graph.Value) string {
func (qs *store) NameOf(v graph.Value) string {
i := v.(int)
if i < 0 || i >= len(ts.data) {
if i < 0 || i >= len(qs.data) {
return ""
}
return ts.data[i]
return qs.data[i]
}
func (ts *store) Size() int64 { return 0 }
func (qs *store) Size() int64 { return 0 }
func (ts *store) DebugPrint() {}
func (qs *store) DebugPrint() {}
func (ts *store) OptimizeIterator(it graph.Iterator) (graph.Iterator, bool) {
func (qs *store) OptimizeIterator(it graph.Iterator) (graph.Iterator, bool) {
return &Null{}, false
}
func (ts *store) FixedIterator() graph.FixedIterator {
func (qs *store) FixedIterator() graph.FixedIterator {
return NewFixedIteratorWithCompare(BasicEquality)
}
func (ts *store) Close() {}
func (qs *store) Close() {}
func (ts *store) TripleDirection(graph.Value, graph.Direction) graph.Value { return 0 }
func (qs *store) TripleDirection(graph.Value, quad.Direction) graph.Value { return 0 }
func (ts *store) RemoveTriple(t *graph.Triple) {}
func (qs *store) RemoveTriple(t *quad.Quad) {}

View file

@ -30,26 +30,31 @@ import (
"fmt"
"strings"
"github.com/barakmich/glog"
"github.com/google/cayley/graph"
)
// An optional iterator has the subconstraint iterator we wish to be optional
// An optional iterator has the sub-constraint iterator we wish to be optional
// and whether the last check we received was true or false.
type Optional struct {
Base
uid uint64
tags graph.Tagger
subIt graph.Iterator
lastCheck bool
result graph.Value
}
// Creates a new optional iterator.
func NewOptional(it graph.Iterator) *Optional {
var o Optional
BaseInit(&o.Base)
o.canNext = false
o.subIt = it
return &o
return &Optional{
uid: NextUID(),
subIt: it,
}
}
func (it *Optional) CanNext() bool { return false }
func (it *Optional) UID() uint64 {
return it.uid
}
func (it *Optional) Reset() {
@ -61,17 +66,23 @@ func (it *Optional) Close() {
it.subIt.Close()
}
func (it *Optional) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Optional) Clone() graph.Iterator {
out := NewOptional(it.subIt.Clone())
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
// Nexting the iterator is unsupported -- error and return an empty set.
// (As above, a reasonable alternative would be to Next() an all iterator)
func (it *Optional) Next() (graph.Value, bool) {
glog.Errorln("Nexting an un-nextable iterator")
return nil, false
// DEPRECATED
func (it *Optional) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Optional) Result() graph.Value {
return it.result
}
// An optional iterator only has a next result if, (a) last time we checked
@ -84,13 +95,18 @@ func (it *Optional) NextResult() bool {
return false
}
// Check() is the real hack of this iterator. It always returns true, regardless
// No subiterators.
func (it *Optional) SubIterators() []graph.Iterator {
return nil
}
// Contains() is the real hack of this iterator. It always returns true, regardless
// of whether the subiterator matched. But we keep track of whether the subiterator
// matched for results purposes.
func (it *Optional) Check(val graph.Value) bool {
checked := it.subIt.Check(val)
func (it *Optional) Contains(val graph.Value) bool {
checked := it.subIt.Contains(val)
it.lastCheck = checked
it.Last = val
it.result = val
return true
}
@ -111,7 +127,7 @@ func (it *Optional) DebugString(indent int) string {
return fmt.Sprintf("%s(%s tags:%s\n%s)",
strings.Repeat(" ", indent),
it.Type(),
it.Tags(),
it.tags.Tags(),
it.subIt.DebugString(indent+4))
}
@ -130,8 +146,13 @@ func (it *Optional) Optimize() (graph.Iterator, bool) {
func (it *Optional) Stats() graph.IteratorStats {
subStats := it.subIt.Stats()
return graph.IteratorStats{
CheckCost: subStats.CheckCost,
NextCost: int64(1 << 62),
Size: subStats.Size,
ContainsCost: subStats.ContainsCost,
NextCost: int64(1 << 62),
Size: subStats.Size,
}
}
// If you're empty and you know it, clap your hands.
func (it *Optional) Size() (int64, bool) {
return 0, true
}
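
The Optional iterator's `Contains()` always answers true but remembers whether its subiterator really matched. A minimal sketch of that behaviour follows. The `set` type, the `optional` struct, and the `check` helper are hypothetical stand-ins rather than code from this repository.

```go
package main

import "fmt"

// Value stands in for graph.Value in this sketch.
type Value interface{}

// set is a hypothetical subiterator: plain membership.
type set map[Value]bool

func (s set) Contains(v Value) bool { return s[v] }

// optional wraps a subiterator whose match is allowed to fail.
type optional struct {
	sub       set
	lastCheck bool
	result    Value
}

// Contains never rejects a value; it only records whether sub matched.
func (it *optional) Contains(v Value) bool {
	it.lastCheck = it.sub.Contains(v)
	it.result = v
	return true
}

// check reports both the external answer and the recorded inner match.
func check(v Value) (accepted, matched bool) {
	it := &optional{sub: set{1: true, 2: true}}
	accepted = it.Contains(v)
	return accepted, it.lastCheck
}

func main() {
	for _, v := range []Value{1, 5} {
		accepted, matched := check(v)
		fmt.Printf("value=%v accepted=%v matched=%v\n", v, accepted, matched)
	}
}
```

Because the constraint can never fail, it only affects which tags end up filled in, which is consistent with the very high `NextCost` the iterator reports for itself in `Stats()`.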

View file

@ -29,29 +29,34 @@ import (
)
type Or struct {
Base
uid uint64
tags graph.Tagger
isShortCircuiting bool
internalIterators []graph.Iterator
itCount int
currentIterator int
result graph.Value
}
func NewOr() *Or {
var or Or
BaseInit(&or.Base)
or.internalIterators = make([]graph.Iterator, 0, 20)
or.isShortCircuiting = false
or.currentIterator = -1
return &or
return &Or{
uid: NextUID(),
internalIterators: make([]graph.Iterator, 0, 20),
currentIterator: -1,
}
}
func NewShortCircuitOr() *Or {
var or Or
BaseInit(&or.Base)
or.internalIterators = make([]graph.Iterator, 0, 20)
or.isShortCircuiting = true
or.currentIterator = -1
return &or
return &Or{
uid: NextUID(),
internalIterators: make([]graph.Iterator, 0, 20),
isShortCircuiting: true,
currentIterator: -1,
}
}
func (it *Or) UID() uint64 {
return it.uid
}
// Reset all internal iterators
@ -62,6 +67,10 @@ func (it *Or) Reset() {
it.currentIterator = -1
}
func (it *Or) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Or) Clone() graph.Iterator {
var or *Or
if it.isShortCircuiting {
@ -72,7 +81,7 @@ func (it *Or) Clone() graph.Iterator {
for _, sub := range it.internalIterators {
or.AddSubIterator(sub.Clone())
}
or.CopyTagsFrom(it)
or.tags.CopyFrom(it)
return or
}
@ -84,7 +93,14 @@ func (it *Or) SubIterators() []graph.Iterator {
// Overrides BaseIterator TagResults, as it needs to add its own results and
// recurse down its subiterators.
func (it *Or) TagResults(dst map[string]graph.Value) {
it.Base.TagResults(dst)
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
it.internalIterators[it.currentIterator].TagResults(dst)
}
@ -105,7 +121,7 @@ func (it *Or) DebugString(indent int) string {
total += fmt.Sprintf("%d:\n%s\n", i, sub.DebugString(indent+4))
}
var tags string
for _, k := range it.Tags() {
for _, k := range it.tags.Tags() {
tags += fmt.Sprintf("%s;", k)
}
spaces := strings.Repeat(" ", indent+2)
@ -139,7 +155,7 @@ func (it *Or) Next() (graph.Value, bool) {
firstTime = true
}
curIt := it.internalIterators[it.currentIterator]
curr, exists = curIt.Next()
curr, exists = graph.Next(curIt)
if !exists {
if it.isShortCircuiting && !firstTime {
return graph.NextLogOut(it, nil, false)
@ -149,18 +165,22 @@ func (it *Or) Next() (graph.Value, bool) {
return graph.NextLogOut(it, nil, false)
}
} else {
it.Last = curr
it.result = curr
return graph.NextLogOut(it, curr, true)
}
}
panic("Somehow broke out of Next() loop in Or")
panic("unreachable")
}
func (it *Or) Result() graph.Value {
return it.result
}
// Checks a value against the iterators, in order.
func (it *Or) checkSubIts(val graph.Value) bool {
func (it *Or) subItsContain(val graph.Value) bool {
var subIsGood = false
for i, sub := range it.internalIterators {
subIsGood = sub.Check(val)
subIsGood = sub.Contains(val)
if subIsGood {
it.currentIterator = i
break
@ -170,14 +190,14 @@ func (it *Or) checkSubIts(val graph.Value) bool {
}
// Check a value against the entire graph.iterator, in order.
func (it *Or) Check(val graph.Value) bool {
graph.CheckLogIn(it, val)
anyGood := it.checkSubIts(val)
func (it *Or) Contains(val graph.Value) bool {
graph.ContainsLogIn(it, val)
anyGood := it.subItsContain(val)
if !anyGood {
return graph.CheckLogOut(it, val, false)
return graph.ContainsLogOut(it, val, false)
}
it.Last = val
return graph.CheckLogOut(it, val, true)
it.result = val
return graph.ContainsLogOut(it, val, true)
}
// Returns the approximate size of the Or graph.iterator. Because we're dealing
@ -247,7 +267,7 @@ func (it *Or) Optimize() (graph.Iterator, bool) {
}
// Move the tags hanging on us (like any good replacement).
newOr.CopyTagsFrom(it)
newOr.tags.CopyFrom(it)
// And close ourselves but not our subiterators -- some may still be alive in
// the new And (they were unchanged upon calling Optimize() on them, at the
@ -257,13 +277,13 @@ func (it *Or) Optimize() (graph.Iterator, bool) {
}
func (it *Or) Stats() graph.IteratorStats {
CheckCost := int64(0)
ContainsCost := int64(0)
NextCost := int64(0)
Size := int64(0)
for _, sub := range it.internalIterators {
stats := sub.Stats()
NextCost += stats.NextCost
CheckCost += stats.CheckCost
ContainsCost += stats.ContainsCost
if it.isShortCircuiting {
if Size < stats.Size {
Size = stats.Size
@ -273,9 +293,9 @@ func (it *Or) Stats() graph.IteratorStats {
}
}
return graph.IteratorStats{
CheckCost: CheckCost,
NextCost: NextCost,
Size: Size,
ContainsCost: ContainsCost,
NextCost: NextCost,
Size: Size,
}
}
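
The `Stats()` hunks above sum `NextCost` and `ContainsCost` over all subiterators and, in the short-circuiting case, take the largest subiterator size. The sketch below reproduces that aggregation. The non-short-circuiting branch falls between two hunks and is not visible in this diff, so summing the sizes there is an assumption. `Stats` and `orStats` are hypothetical names.

```go
package main

import "fmt"

// Stats is a simplified stand-in for graph.IteratorStats.
type Stats struct {
	ContainsCost int64
	NextCost     int64
	Size         int64
}

// orStats combines subiterator statistics the way the Or iterator appears to.
func orStats(subs []Stats, shortCircuit bool) Stats {
	var out Stats
	for _, s := range subs {
		out.NextCost += s.NextCost
		out.ContainsCost += s.ContainsCost
		if shortCircuit {
			// Only one subiterator ends up producing results.
			if out.Size < s.Size {
				out.Size = s.Size
			}
		} else {
			// Assumption: a plain Or may yield every subiterator's values.
			out.Size += s.Size
		}
	}
	return out
}

func main() {
	subs := []Stats{
		{ContainsCost: 1, NextCost: 1, Size: 10},
		{ContainsCost: 2, NextCost: 3, Size: 40},
	}
	fmt.Printf("%+v\n", orStats(subs, false))
	fmt.Printf("%+v\n", orStats(subs, true))
}
```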

View file

@ -24,7 +24,7 @@ import (
func iterated(it graph.Iterator) []int {
var res []int
for {
val, ok := it.Next()
val, ok := graph.Next(it)
if !ok {
break
}
@ -66,13 +66,13 @@ func TestOrIteratorBasics(t *testing.T) {
}
for _, v := range []int{2, 3, 21} {
if !or.Check(v) {
if !or.Contains(v) {
t.Errorf("Failed to correctly check %d as true", v)
}
}
for _, v := range []int{22, 5, 0} {
if or.Check(v) {
if or.Contains(v) {
t.Errorf("Failed to correctly check %d as false", v)
}
}
@ -125,12 +125,12 @@ func TestShortCircuitingOrBasics(t *testing.T) {
or.AddSubIterator(f1)
or.AddSubIterator(f2)
for _, v := range []int{2, 3, 21} {
if !or.Check(v) {
if !or.Contains(v) {
t.Errorf("Failed to correctly check %d as true", v)
}
}
for _, v := range []int{22, 5, 0} {
if or.Check(v) {
if or.Contains(v) {
t.Errorf("Failed to correctly check %d as false", v)
}
}

View file

@ -16,6 +16,7 @@ package iterator
import (
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
type Node struct {
@ -39,7 +40,7 @@ type queryShape struct {
ts graph.TripleStore
nodeId int
hasaIds []int
hasaDirs []graph.Direction
hasaDirs []quad.Direction
}
func OutputQueryShapeForIterator(it graph.Iterator, ts graph.TripleStore, outputMap map[string]interface{}) {
@ -62,11 +63,11 @@ func (qs *queryShape) AddLink(l *Link) {
qs.links = append(qs.links, *l)
}
func (qs *queryShape) LastHasa() (int, graph.Direction) {
func (qs *queryShape) LastHasa() (int, quad.Direction) {
return qs.hasaIds[len(qs.hasaIds)-1], qs.hasaDirs[len(qs.hasaDirs)-1]
}
func (qs *queryShape) PushHasa(i int, d graph.Direction) {
func (qs *queryShape) PushHasa(i int, d quad.Direction) {
qs.hasaIds = append(qs.hasaIds, i)
qs.hasaDirs = append(qs.hasaDirs, d)
}
@ -107,10 +108,10 @@ func (qs *queryShape) StealNode(left *Node, right *Node) {
func (qs *queryShape) MakeNode(it graph.Iterator) *Node {
n := Node{Id: qs.nodeId}
for _, tag := range it.Tags() {
for _, tag := range it.Tagger().Tags() {
n.Tags = append(n.Tags, tag)
}
for k, _ := range it.FixedTags() {
for k, _ := range it.Tagger().Fixed() {
n.Tags = append(n.Tags, k)
}
@ -129,7 +130,7 @@ func (qs *queryShape) MakeNode(it graph.Iterator) *Node {
case graph.Fixed:
n.IsFixed = true
for {
val, more := it.Next()
val, more := graph.Next(it)
if !more {
break
}
@ -159,10 +160,10 @@ func (qs *queryShape) MakeNode(it graph.Iterator) *Node {
qs.nodeId++
newNode := qs.MakeNode(lto.primaryIt)
hasaID, hasaDir := qs.LastHasa()
if (hasaDir == graph.Subject && lto.dir == graph.Object) ||
(hasaDir == graph.Object && lto.dir == graph.Subject) {
if (hasaDir == quad.Subject && lto.dir == quad.Object) ||
(hasaDir == quad.Object && lto.dir == quad.Subject) {
qs.AddNode(newNode)
if hasaDir == graph.Subject {
if hasaDir == quad.Subject {
qs.AddLink(&Link{hasaID, newNode.Id, 0, n.Id})
} else {
qs.AddLink(&Link{newNode.Id, hasaID, 0, n.Id})

View file

@ -19,6 +19,7 @@ import (
"testing"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
func hasaWithTag(ts graph.TripleStore, tag string, target string) *HasA {
@ -26,14 +27,14 @@ func hasaWithTag(ts graph.TripleStore, tag string, target string) *HasA {
obj := ts.FixedIterator()
obj.Add(ts.ValueOf(target))
obj.AddTag(tag)
and.AddSubIterator(NewLinksTo(ts, obj, graph.Object))
obj.Tagger().Add(tag)
and.AddSubIterator(NewLinksTo(ts, obj, quad.Object))
pred := ts.FixedIterator()
pred.Add(ts.ValueOf("status"))
and.AddSubIterator(NewLinksTo(ts, pred, graph.Predicate))
and.AddSubIterator(NewLinksTo(ts, pred, quad.Predicate))
return NewHasA(ts, and, graph.Subject)
return NewHasA(ts, and, quad.Subject)
}
func TestQueryShape(t *testing.T) {
@ -48,7 +49,7 @@ func TestQueryShape(t *testing.T) {
// Given a single linkage iterator's shape.
hasa := hasaWithTag(ts, "tag", "cool")
hasa.AddTag("top")
hasa.Tagger().Add("top")
shape := make(map[string]interface{})
OutputQueryShapeForIterator(hasa, ts, shape)
@ -93,22 +94,22 @@ func TestQueryShape(t *testing.T) {
andInternal := NewAnd()
hasa1 := hasaWithTag(ts, "tag1", "cool")
hasa1.AddTag("hasa1")
hasa1.Tagger().Add("hasa1")
andInternal.AddSubIterator(hasa1)
hasa2 := hasaWithTag(ts, "tag2", "fun")
hasa2.AddTag("hasa2")
hasa2.Tagger().Add("hasa2")
andInternal.AddSubIterator(hasa2)
pred := ts.FixedIterator()
pred.Add(ts.ValueOf("name"))
and := NewAnd()
and.AddSubIterator(NewLinksTo(ts, andInternal, graph.Subject))
and.AddSubIterator(NewLinksTo(ts, pred, graph.Predicate))
and.AddSubIterator(NewLinksTo(ts, andInternal, quad.Subject))
and.AddSubIterator(NewLinksTo(ts, pred, quad.Predicate))
shape = make(map[string]interface{})
OutputQueryShapeForIterator(NewHasA(ts, and, graph.Object), ts, shape)
OutputQueryShapeForIterator(NewHasA(ts, and, quad.Object), ts, shape)
links = shape["links"].([]Link)
if len(links) != 3 {

View file

@ -17,7 +17,7 @@ package iterator
// "Value Comparison" is a unary operator -- a filter across the values in the
// relevant subiterator.
//
// This is hugely useful for things like provenance, but value ranges in general
// This is hugely useful for things like label, but value ranges in general
// come up from time to time. At *worst* we're as big as our underlying iterator.
// At best, we're the null iterator.
//
@ -46,21 +46,27 @@ const (
)
type Comparison struct {
Base
subIt graph.Iterator
op Operator
val interface{}
ts graph.TripleStore
uid uint64
tags graph.Tagger
subIt graph.Iterator
op Operator
val interface{}
ts graph.TripleStore
result graph.Value
}
func NewComparison(sub graph.Iterator, op Operator, val interface{}, ts graph.TripleStore) *Comparison {
var vc Comparison
BaseInit(&vc.Base)
vc.subIt = sub
vc.op = op
vc.val = val
vc.ts = ts
return &vc
return &Comparison{
uid: NextUID(),
subIt: sub,
op: op,
val: val,
ts: ts,
}
}
func (it *Comparison) UID() uint64 {
return it.uid
}
// Here's the non-boilerplate part of the ValueComparison iterator. Given a value
@ -111,9 +117,13 @@ func (it *Comparison) Reset() {
it.subIt.Reset()
}
func (it *Comparison) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Comparison) Clone() graph.Iterator {
out := NewComparison(it.subIt.Clone(), it.op, it.val, it.ts)
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
@ -121,7 +131,7 @@ func (it *Comparison) Next() (graph.Value, bool) {
var val graph.Value
var ok bool
for {
val, ok = it.subIt.Next()
val, ok = graph.Next(it.subIt)
if !ok {
return nil, false
}
@ -129,10 +139,19 @@ func (it *Comparison) Next() (graph.Value, bool) {
break
}
}
it.Last = val
it.result = val
return val, ok
}
// DEPRECATED
func (it *Comparison) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Comparison) Result() graph.Value {
return it.result
}
func (it *Comparison) NextResult() bool {
for {
hasNext := it.subIt.NextResult()
@ -143,21 +162,33 @@ func (it *Comparison) NextResult() bool {
return true
}
}
it.Last = it.subIt.Result()
it.result = it.subIt.Result()
return true
}
func (it *Comparison) Check(val graph.Value) bool {
// No subiterators.
func (it *Comparison) SubIterators() []graph.Iterator {
return nil
}
func (it *Comparison) Contains(val graph.Value) bool {
if !it.doComparison(val) {
return false
}
return it.subIt.Check(val)
return it.subIt.Contains(val)
}
// If we failed the check, then the subiterator should not contribute to the result
// set. Otherwise, go ahead and tag it.
func (it *Comparison) TagResults(dst map[string]graph.Value) {
it.Base.TagResults(dst)
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
it.subIt.TagResults(dst)
}
@ -188,3 +219,7 @@ func (it *Comparison) Optimize() (graph.Iterator, bool) {
func (it *Comparison) Stats() graph.IteratorStats {
return it.subIt.Stats()
}
func (it *Comparison) Size() (int64, bool) {
return 0, true
}
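
The Comparison hunks above keep the filtering logic unchanged while moving state out of Base: Next pulls from the subiterator until a value passes the comparison, and Contains rejects a failing value before asking the subiterator at all. A minimal standalone sketch of that pattern follows. The Iterator interface, the fixed and lessThan types, and collect are hypothetical stand-ins working on int64, not Cayley's graph package.

```go
package main

import "fmt"

// Iterator is a hypothetical, pared-down stand-in for graph.Iterator.
type Iterator interface {
	Next() (int64, bool)
	Contains(v int64) bool
}

// fixed is a toy iterator over a fixed slice of values.
type fixed struct {
	vals []int64
	pos  int
}

func (f *fixed) Next() (int64, bool) {
	if f.pos >= len(f.vals) {
		return 0, false
	}
	v := f.vals[f.pos]
	f.pos++
	return v, true
}

func (f *fixed) Contains(v int64) bool {
	for _, x := range f.vals {
		if x == v {
			return true
		}
	}
	return false
}

// lessThan keeps only the values of sub that are strictly below limit.
type lessThan struct {
	sub    Iterator
	limit  int64
	result int64 // last value returned, held by the iterator itself
}

// Next skips subiterator values until one passes the comparison.
func (it *lessThan) Next() (int64, bool) {
	for {
		v, ok := it.sub.Next()
		if !ok {
			return 0, false
		}
		if v < it.limit {
			it.result = v
			return v, true
		}
	}
}

// Contains applies the cheap comparison first, then defers to sub.
func (it *lessThan) Contains(v int64) bool {
	if v >= it.limit {
		return false
	}
	return it.sub.Contains(v)
}

func (it *lessThan) DebugString() string {
	return fmt.Sprintf("(lessThan limit:%d result:%d)", it.limit, it.result)
}

// collect drains an iterator into a slice.
func collect(it Iterator) []int64 {
	var res []int64
	for {
		v, ok := it.Next()
		if !ok {
			break
		}
		res = append(res, v)
	}
	return res
}

func main() {
	it := &lessThan{sub: &fixed{vals: []int64{0, 1, 2, 3, 4}}, limit: 2}
	fmt.Println(collect(it), it.Contains(1), it.Contains(3))
	fmt.Println(it.DebugString())
}
```

Doing the comparison before delegating in Contains means a value that fails the filter never costs a subiterator lookup.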

View file

@ -82,7 +82,7 @@ func TestValueComparison(t *testing.T) {
}
}
var vciCheckTests = []struct {
var vciContainsTests = []struct {
message string
operator Operator
check graph.Value
@ -114,10 +114,10 @@ var vciCheckTests = []struct {
},
}
func TestVCICheck(t *testing.T) {
for _, test := range vciCheckTests {
func TestVCIContains(t *testing.T) {
for _, test := range vciContainsTests {
vc := NewComparison(simpleFixedIterator(), test.operator, int64(2), simpleStore)
if vc.Check(test.check) != test.expect {
if vc.Contains(test.check) != test.expect {
t.Errorf("Failed to show %s", test.message)
}
}

View file

@ -24,36 +24,51 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
type AllIterator struct {
iterator.Base
uid uint64
tags graph.Tagger
prefix []byte
dir graph.Direction
dir quad.Direction
open bool
iter ldbit.Iterator
ts *TripleStore
ro *opt.ReadOptions
result graph.Value
}
func NewAllIterator(prefix string, d graph.Direction, ts *TripleStore) *AllIterator {
var it AllIterator
iterator.BaseInit(&it.Base)
it.ro = &opt.ReadOptions{}
it.ro.DontFillCache = true
it.iter = ts.db.NewIterator(nil, it.ro)
it.prefix = []byte(prefix)
it.dir = d
it.open = true
it.ts = ts
func NewAllIterator(prefix string, d quad.Direction, ts *TripleStore) *AllIterator {
opts := &opt.ReadOptions{
DontFillCache: true,
}
it := AllIterator{
uid: iterator.NextUID(),
ro: opts,
iter: ts.db.NewIterator(nil, opts),
prefix: []byte(prefix),
dir: d,
open: true,
ts: ts,
}
it.iter.Seek(it.prefix)
if !it.iter.Valid() {
// FIXME(kortschak) What are the semantics here? Is this iterator usable?
// If not, we should return nil *Iterator and an error.
it.open = false
it.iter.Release()
}
return &it
}
func (it *AllIterator) UID() uint64 {
return it.uid
}
func (it *AllIterator) Reset() {
if !it.open {
it.iter = it.ts.db.NewIterator(nil, it.ro)
@ -66,15 +81,29 @@ func (it *AllIterator) Reset() {
}
}
func (it *AllIterator) Tagger() *graph.Tagger {
return &it.tags
}
func (it *AllIterator) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *AllIterator) Clone() graph.Iterator {
out := NewAllIterator(string(it.prefix), it.dir, it.ts)
out.CopyTagsFrom(it)
out.tags.CopyFrom(it)
return out
}
func (it *AllIterator) Next() (graph.Value, bool) {
if !it.open {
it.Last = nil
it.result = nil
return nil, false
}
var out []byte
@ -88,12 +117,29 @@ func (it *AllIterator) Next() (graph.Value, bool) {
it.Close()
return nil, false
}
it.Last = out
it.result = out
return out, true
}
func (it *AllIterator) Check(v graph.Value) bool {
it.Last = v
func (it *AllIterator) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *AllIterator) Result() graph.Value {
return it.result
}
func (it *AllIterator) NextResult() bool {
return false
}
// No subiterators.
func (it *AllIterator) SubIterators() []graph.Iterator {
return nil
}
func (it *AllIterator) Contains(v graph.Value) bool {
it.result = v
return true
}
@ -115,7 +161,7 @@ func (it *AllIterator) Size() (int64, bool) {
func (it *AllIterator) DebugString(indent int) string {
size, _ := it.Size()
return fmt.Sprintf("%s(%s tags: %v leveldb size:%d %s %p)", strings.Repeat(" ", indent), it.Type(), it.Tags(), size, it.dir, it)
return fmt.Sprintf("%s(%s tags: %v leveldb size:%d %s %p)", strings.Repeat(" ", indent), it.Type(), it.tags.Tags(), size, it.dir, it)
}
func (it *AllIterator) Type() graph.Type { return graph.All }
@ -128,8 +174,8 @@ func (it *AllIterator) Optimize() (graph.Iterator, bool) {
func (it *AllIterator) Stats() graph.IteratorStats {
s, _ := it.Size()
return graph.IteratorStats{
CheckCost: 1,
NextCost: 2,
Size: s,
ContainsCost: 1,
NextCost: 2,
Size: s,
}
}
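
The TagResults methods added in this file and the next all follow the same shape: every plain tag maps to the iterator's current result, and every fixed tag maps to its stored value. The sketch below shows that shape in isolation. Tagger, Value, leaf and describe are simplified, hypothetical stand-ins, and the real graph.Tagger may differ in its fields and methods.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Value is a hypothetical stand-in for graph.Value.
type Value interface{}

// Tagger is a simplified, hypothetical stand-in for graph.Tagger.
type Tagger struct {
	tags  []string
	fixed map[string]Value
}

// Add registers a tag that will receive the iterator's current result.
func (t *Tagger) Add(tag string) { t.tags = append(t.tags, tag) }

// AddFixed registers a tag that always maps to the given value.
func (t *Tagger) AddFixed(tag string, v Value) {
	if t.fixed == nil {
		t.fixed = make(map[string]Value)
	}
	t.fixed[tag] = v
}

func (t *Tagger) Tags() []string          { return t.tags }
func (t *Tagger) Fixed() map[string]Value { return t.fixed }

// leaf is a toy iterator with no subiterators.
type leaf struct {
	tags   Tagger
	result Value
}

func (it *leaf) Tagger() *Tagger { return &it.tags }
func (it *leaf) Result() Value   { return it.result }

// TagResults mirrors the loops in the diff: plain tags take the
// current result, fixed tags take their stored value.
func (it *leaf) TagResults(dst map[string]Value) {
	for _, tag := range it.tags.Tags() {
		dst[tag] = it.Result()
	}
	for tag, value := range it.tags.Fixed() {
		dst[tag] = value
	}
}

// describe renders a tag map in sorted key order.
func describe(dst map[string]Value) string {
	keys := make([]string, 0, len(dst))
	for k := range dst {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%v", k, dst[k]))
	}
	return strings.Join(parts, " ")
}

func main() {
	it := &leaf{result: "B"}
	it.Tagger().Add("who")
	it.Tagger().AddFixed("source", "F")
	dst := make(map[string]Value)
	it.TagResults(dst)
	fmt.Println(describe(dst))
}
```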

View file

@ -24,45 +24,63 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
type Iterator struct {
iterator.Base
uid uint64
tags graph.Tagger
nextPrefix []byte
checkId []byte
dir graph.Direction
dir quad.Direction
open bool
iter ldbit.Iterator
ts *TripleStore
qs *TripleStore
ro *opt.ReadOptions
originalPrefix string
result graph.Value
}
func NewIterator(prefix string, d graph.Direction, value graph.Value, ts *TripleStore) *Iterator {
var it Iterator
iterator.BaseInit(&it.Base)
it.checkId = value.([]byte)
it.dir = d
it.originalPrefix = prefix
it.nextPrefix = make([]byte, 0, 2+ts.hasher.Size())
it.nextPrefix = append(it.nextPrefix, []byte(prefix)...)
it.nextPrefix = append(it.nextPrefix, []byte(it.checkId[1:])...)
it.ro = &opt.ReadOptions{}
it.ro.DontFillCache = true
it.iter = ts.db.NewIterator(nil, it.ro)
it.open = true
it.ts = ts
func NewIterator(prefix string, d quad.Direction, value graph.Value, qs *TripleStore) *Iterator {
vb := value.([]byte)
p := make([]byte, 0, 2+qs.hasher.Size())
p = append(p, []byte(prefix)...)
p = append(p, []byte(vb[1:])...)
opts := &opt.ReadOptions{
DontFillCache: true,
}
it := Iterator{
uid: iterator.NextUID(),
nextPrefix: p,
checkId: vb,
dir: d,
originalPrefix: prefix,
ro: opts,
iter: qs.db.NewIterator(nil, opts),
open: true,
qs: qs,
}
ok := it.iter.Seek(it.nextPrefix)
if !ok {
// FIXME(kortschak) What are the semantics here? Is this iterator usable?
// If not, we should return nil *Iterator and an error.
it.open = false
it.iter.Release()
}
return &it
}
func (it *Iterator) UID() uint64 {
return it.uid
}
func (it *Iterator) Reset() {
if !it.open {
it.iter = it.ts.db.NewIterator(nil, it.ro)
it.iter = it.qs.db.NewIterator(nil, it.ro)
it.open = true
}
ok := it.iter.Seek(it.nextPrefix)
@ -72,9 +90,23 @@ func (it *Iterator) Reset() {
}
}
func (it *Iterator) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Iterator) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *Iterator) Clone() graph.Iterator {
out := NewIterator(it.originalPrefix, it.dir, it.checkId, it.ts)
out.CopyTagsFrom(it)
out := NewIterator(it.originalPrefix, it.dir, it.checkId, it.qs)
out.tags.CopyFrom(it)
return out
}
@ -87,22 +119,22 @@ func (it *Iterator) Close() {
func (it *Iterator) Next() (graph.Value, bool) {
if it.iter == nil {
it.Last = nil
it.result = nil
return nil, false
}
if !it.open {
it.Last = nil
it.result = nil
return nil, false
}
if !it.iter.Valid() {
it.Last = nil
it.result = nil
it.Close()
return nil, false
}
if bytes.HasPrefix(it.iter.Key(), it.nextPrefix) {
out := make([]byte, len(it.iter.Key()))
copy(out, it.iter.Key())
it.Last = out
it.result = out
ok := it.iter.Next()
if !ok {
it.Close()
@ -110,75 +142,92 @@ func (it *Iterator) Next() (graph.Value, bool) {
return out, true
}
it.Close()
it.Last = nil
it.result = nil
return nil, false
}
func PositionOf(prefix []byte, d graph.Direction, ts *TripleStore) int {
func (it *Iterator) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Iterator) Result() graph.Value {
return it.result
}
func (it *Iterator) NextResult() bool {
return false
}
// No subiterators.
func (it *Iterator) SubIterators() []graph.Iterator {
return nil
}
func PositionOf(prefix []byte, d quad.Direction, qs *TripleStore) int {
if bytes.Equal(prefix, []byte("sp")) {
switch d {
case graph.Subject:
case quad.Subject:
return 2
case graph.Predicate:
return ts.hasher.Size() + 2
case graph.Object:
return 2*ts.hasher.Size() + 2
case graph.Provenance:
case quad.Predicate:
return qs.hasher.Size() + 2
case quad.Object:
return 2*qs.hasher.Size() + 2
case quad.Label:
return -1
}
}
if bytes.Equal(prefix, []byte("po")) {
switch d {
case graph.Subject:
return 2*ts.hasher.Size() + 2
case graph.Predicate:
case quad.Subject:
return 2*qs.hasher.Size() + 2
case quad.Predicate:
return 2
case graph.Object:
return ts.hasher.Size() + 2
case graph.Provenance:
case quad.Object:
return qs.hasher.Size() + 2
case quad.Label:
return -1
}
}
if bytes.Equal(prefix, []byte("os")) {
switch d {
case graph.Subject:
return ts.hasher.Size() + 2
case graph.Predicate:
return 2*ts.hasher.Size() + 2
case graph.Object:
case quad.Subject:
return qs.hasher.Size() + 2
case quad.Predicate:
return 2*qs.hasher.Size() + 2
case quad.Object:
return 2
case graph.Provenance:
case quad.Label:
return -1
}
}
if bytes.Equal(prefix, []byte("cp")) {
switch d {
case graph.Subject:
return 2*ts.hasher.Size() + 2
case graph.Predicate:
return ts.hasher.Size() + 2
case graph.Object:
return 3*ts.hasher.Size() + 2
case graph.Provenance:
case quad.Subject:
return 2*qs.hasher.Size() + 2
case quad.Predicate:
return qs.hasher.Size() + 2
case quad.Object:
return 3*qs.hasher.Size() + 2
case quad.Label:
return 2
}
}
panic("unreachable")
}
func (it *Iterator) Check(v graph.Value) bool {
func (it *Iterator) Contains(v graph.Value) bool {
val := v.([]byte)
if val[0] == 'z' {
return false
}
offset := PositionOf(val[0:2], it.dir, it.ts)
offset := PositionOf(val[0:2], it.dir, it.qs)
if offset != -1 {
if bytes.HasPrefix(val[offset:], it.checkId[1:]) {
return true
}
} else {
nameForDir := it.ts.Triple(v).Get(it.dir)
hashForDir := it.ts.ValueOf(nameForDir).([]byte)
nameForDir := it.qs.Quad(v).Get(it.dir)
hashForDir := it.qs.ValueOf(nameForDir).([]byte)
if bytes.Equal(hashForDir, it.checkId) {
return true
}
@ -187,12 +236,12 @@ func (it *Iterator) Check(v graph.Value) bool {
}
func (it *Iterator) Size() (int64, bool) {
return it.ts.SizeOf(it.checkId), true
return it.qs.SizeOf(it.checkId), true
}
func (it *Iterator) DebugString(indent int) string {
size, _ := it.Size()
return fmt.Sprintf("%s(%s %d tags: %v dir: %s size:%d %s)", strings.Repeat(" ", indent), it.Type(), it.UID(), it.Tags(), it.dir, size, it.ts.NameOf(it.checkId))
return fmt.Sprintf("%s(%s %d tags: %v dir: %s size:%d %s)", strings.Repeat(" ", indent), it.Type(), it.UID(), it.tags.Tags(), it.dir, size, it.qs.NameOf(it.checkId))
}
var levelDBType graph.Type
@ -213,8 +262,8 @@ func (it *Iterator) Optimize() (graph.Iterator, bool) {
func (it *Iterator) Stats() graph.IteratorStats {
s, _ := it.Size()
return graph.IteratorStats{
CheckCost: 1,
NextCost: 2,
Size: s,
ContainsCost: 1,
NextCost: 2,
Size: s,
}
}
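
PositionOf in the hunk above encodes one rule four times: a key is a 2-byte index prefix followed by fixed-size hashes in an order determined by that prefix, so the offset of a direction is 2 plus its slot times the hash size, or -1 when the index does not carry that direction. A table-driven sketch of the same arithmetic is given below. The Direction constants, the order table and positionOf are illustrative names, hashSize is passed in as a parameter instead of coming from qs.hasher.Size(), and an unknown prefix returns an error where the original panics.

```go
package main

import "fmt"

// Direction is a hypothetical stand-in for quad.Direction.
type Direction int

const (
	Subject Direction = iota
	Predicate
	Object
	Label
)

// order lists, per index prefix, which direction's hash occupies
// each slot after the 2-byte prefix. Taken from the switch arms
// in the diff's PositionOf.
var order = map[string][]Direction{
	"sp": {Subject, Predicate, Object},
	"po": {Predicate, Object, Subject},
	"os": {Object, Subject, Predicate},
	"cp": {Label, Predicate, Subject, Object},
}

// positionOf returns the byte offset of direction d inside a key
// with the given prefix, or -1 if that index does not store d.
func positionOf(prefix string, d Direction, hashSize int) (int, error) {
	dirs, ok := order[prefix]
	if !ok {
		return 0, fmt.Errorf("unknown index prefix %q", prefix)
	}
	for slot, dir := range dirs {
		if dir == d {
			return 2 + slot*hashSize, nil
		}
	}
	return -1, nil
}

func main() {
	const hashSize = 20 // assumed hash length, for illustration only
	for _, prefix := range []string{"sp", "po", "os", "cp"} {
		for _, d := range []Direction{Subject, Predicate, Object, Label} {
			off, _ := positionOf(prefix, d, hashSize)
			fmt.Printf("%s dir=%d offset=%d\n", prefix, d, off)
		}
	}
}
```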

View file

@ -23,10 +23,11 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
func makeTripleSet() []*graph.Triple {
tripleSet := []*graph.Triple{
func makeTripleSet() []*quad.Quad {
tripleSet := []*quad.Quad{
{"A", "follows", "B", ""},
{"C", "follows", "B", ""},
{"C", "follows", "D", ""},
@ -42,20 +43,20 @@ func makeTripleSet() []*graph.Triple {
return tripleSet
}
func iteratedTriples(ts graph.TripleStore, it graph.Iterator) []*graph.Triple {
func iteratedTriples(qs graph.TripleStore, it graph.Iterator) []*quad.Quad {
var res ordered
for {
val, ok := it.Next()
val, ok := graph.Next(it)
if !ok {
break
}
res = append(res, ts.Triple(val))
res = append(res, qs.Quad(val))
}
sort.Sort(res)
return res
}
type ordered []*graph.Triple
type ordered []*quad.Quad
func (o ordered) Len() int { return len(o) }
func (o ordered) Less(i, j int) bool {
@ -72,7 +73,7 @@ func (o ordered) Less(i, j int) bool {
o[i].Subject == o[j].Subject &&
o[i].Predicate == o[j].Predicate &&
o[i].Object == o[j].Object &&
o[i].Provenance < o[j].Provenance:
o[i].Label < o[j].Label:
return true
@ -82,14 +83,14 @@ func (o ordered) Less(i, j int) bool {
}
func (o ordered) Swap(i, j int) { o[i], o[j] = o[j], o[i] }
func iteratedNames(ts graph.TripleStore, it graph.Iterator) []string {
func iteratedNames(qs graph.TripleStore, it graph.Iterator) []string {
var res []string
for {
val, ok := it.Next()
val, ok := graph.Next(it)
if !ok {
break
}
res = append(res, ts.NameOf(val))
res = append(res, qs.NameOf(val))
}
sort.Strings(res)
return res
@ -107,14 +108,14 @@ func TestCreateDatabase(t *testing.T) {
t.Fatal("Failed to create LevelDB database.")
}
ts, err := newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err := newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
if s := ts.Size(); s != 0 {
if s := qs.Size(); s != 0 {
t.Errorf("Unexpected size, got:%d expected:0", s)
}
ts.Close()
qs.Close()
err = createNewLevelDB("/dev/null/some terrible path", nil)
if err == nil {
@ -137,53 +138,53 @@ func TestLoadDatabase(t *testing.T) {
t.Fatal("Failed to create LevelDB database.")
}
ts, err := newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err := newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
ts.AddTriple(&graph.Triple{"Something", "points_to", "Something Else", "context"})
qs.AddTriple(&quad.Quad{"Something", "points_to", "Something Else", "context"})
for _, pq := range []string{"Something", "points_to", "Something Else", "context"} {
if got := ts.NameOf(ts.ValueOf(pq)); got != pq {
if got := qs.NameOf(qs.ValueOf(pq)); got != pq {
t.Errorf("Failed to roundtrip %q, got:%q expect:%q", pq, got, pq)
}
}
if s := ts.Size(); s != 1 {
if s := qs.Size(); s != 1 {
t.Errorf("Unexpected triplestore size, got:%d expect:1", s)
}
ts.Close()
qs.Close()
err = createNewLevelDB(tmpDir, nil)
if err != nil {
t.Fatal("Failed to create LevelDB database.")
}
ts, err = newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err = newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
ts2, didConvert := ts.(*TripleStore)
ts2, didConvert := qs.(*TripleStore)
if !didConvert {
t.Errorf("Could not convert from generic to LevelDB TripleStore")
}
ts.AddTripleSet(makeTripleSet())
if s := ts.Size(); s != 11 {
qs.AddTripleSet(makeTripleSet())
if s := qs.Size(); s != 11 {
t.Errorf("Unexpected triplestore size, got:%d expect:11", s)
}
if s := ts2.SizeOf(ts.ValueOf("B")); s != 5 {
if s := ts2.SizeOf(qs.ValueOf("B")); s != 5 {
t.Errorf("Unexpected triplestore size, got:%d expect:5", s)
}
ts.RemoveTriple(&graph.Triple{"A", "follows", "B", ""})
if s := ts.Size(); s != 10 {
qs.RemoveTriple(&quad.Quad{"A", "follows", "B", ""})
if s := qs.Size(); s != 10 {
t.Errorf("Unexpected triplestore size after RemoveTriple, got:%d expect:10", s)
}
if s := ts2.SizeOf(ts.ValueOf("B")); s != 4 {
if s := ts2.SizeOf(qs.ValueOf("B")); s != 4 {
t.Errorf("Unexpected triplestore size, got:%d expect:4", s)
}
ts.Close()
qs.Close()
}
func TestIterator(t *testing.T) {
@ -199,14 +200,14 @@ func TestIterator(t *testing.T) {
t.Fatal("Failed to create LevelDB database.")
}
ts, err := newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err := newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
ts.AddTripleSet(makeTripleSet())
qs.AddTripleSet(makeTripleSet())
var it graph.Iterator
it = ts.NodesAllIterator()
it = qs.NodesAllIterator()
if it == nil {
t.Fatal("Got nil iterator.")
}
@ -241,7 +242,7 @@ func TestIterator(t *testing.T) {
}
sort.Strings(expect)
for i := 0; i < 2; i++ {
got := iteratedNames(ts, it)
got := iteratedNames(qs, it)
sort.Strings(got)
if !reflect.DeepEqual(got, expect) {
t.Errorf("Unexpected iterated result on repeat %d, got:%v expect:%v", i, got, expect)
@ -250,23 +251,23 @@ func TestIterator(t *testing.T) {
}
for _, pq := range expect {
if !it.Check(ts.ValueOf(pq)) {
if !it.Contains(qs.ValueOf(pq)) {
t.Errorf("Failed to find and check %q correctly", pq)
}
}
// FIXME(kortschak) Why does this fail?
/*
for _, pq := range []string{"baller"} {
if it.Check(ts.ValueOf(pq)) {
if it.Contains(qs.ValueOf(pq)) {
t.Errorf("Failed to check %q correctly", pq)
}
}
*/
it.Reset()
it = ts.TriplesAllIterator()
edge, _ := it.Next()
triple := ts.Triple(edge)
it = qs.TriplesAllIterator()
edge, _ := graph.Next(it)
triple := qs.Quad(edge)
set := makeTripleSet()
var ok bool
for _, t := range set {
@ -279,7 +280,7 @@ func TestIterator(t *testing.T) {
t.Errorf("Failed to find %q during iteration, got:%q", triple, set)
}
ts.Close()
qs.Close()
}
func TestSetIterator(t *testing.T) {
@ -292,95 +293,95 @@ func TestSetIterator(t *testing.T) {
t.Fatalf("Failed to create working directory")
}
ts, err := newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err := newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
defer ts.Close()
defer qs.Close()
ts.AddTripleSet(makeTripleSet())
qs.AddTripleSet(makeTripleSet())
expect := []*graph.Triple{
expect := []*quad.Quad{
{"C", "follows", "B", ""},
{"C", "follows", "D", ""},
}
sort.Sort(ordered(expect))
// Subject iterator.
it := ts.TripleIterator(graph.Subject, ts.ValueOf("C"))
it := qs.TripleIterator(quad.Subject, qs.ValueOf("C"))
if got := iteratedTriples(ts, it); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, it); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to get expected results, got:%v expect:%v", got, expect)
}
it.Reset()
and := iterator.NewAnd()
and.AddSubIterator(ts.TriplesAllIterator())
and.AddSubIterator(qs.TriplesAllIterator())
and.AddSubIterator(it)
if got := iteratedTriples(ts, and); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, and); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to confirm expected results, got:%v expect:%v", got, expect)
}
// Object iterator.
it = ts.TripleIterator(graph.Object, ts.ValueOf("F"))
it = qs.TripleIterator(quad.Object, qs.ValueOf("F"))
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "follows", "F", ""},
{"E", "follows", "F", ""},
}
sort.Sort(ordered(expect))
if got := iteratedTriples(ts, it); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, it); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to get expected results, got:%v expect:%v", got, expect)
}
and = iterator.NewAnd()
and.AddSubIterator(ts.TripleIterator(graph.Subject, ts.ValueOf("B")))
and.AddSubIterator(qs.TripleIterator(quad.Subject, qs.ValueOf("B")))
and.AddSubIterator(it)
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "follows", "F", ""},
}
if got := iteratedTriples(ts, and); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, and); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to confirm expected results, got:%v expect:%v", got, expect)
}
// Predicate iterator.
it = ts.TripleIterator(graph.Predicate, ts.ValueOf("status"))
it = qs.TripleIterator(quad.Predicate, qs.ValueOf("status"))
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "status", "cool", "status_graph"},
{"D", "status", "cool", "status_graph"},
{"G", "status", "cool", "status_graph"},
}
sort.Sort(ordered(expect))
if got := iteratedTriples(ts, it); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, it); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to get expected results from predicate iterator, got:%v expect:%v", got, expect)
}
// Provenance iterator.
it = ts.TripleIterator(graph.Provenance, ts.ValueOf("status_graph"))
// Label iterator.
it = qs.TripleIterator(quad.Label, qs.ValueOf("status_graph"))
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "status", "cool", "status_graph"},
{"D", "status", "cool", "status_graph"},
{"G", "status", "cool", "status_graph"},
}
sort.Sort(ordered(expect))
if got := iteratedTriples(ts, it); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, it); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to get expected results from label iterator, got:%v expect:%v", got, expect)
}
it.Reset()
// Order is important
and = iterator.NewAnd()
and.AddSubIterator(ts.TripleIterator(graph.Subject, ts.ValueOf("B")))
and.AddSubIterator(qs.TripleIterator(quad.Subject, qs.ValueOf("B")))
and.AddSubIterator(it)
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "status", "cool", "status_graph"},
}
if got := iteratedTriples(ts, and); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, and); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to confirm expected results, got:%v expect:%v", got, expect)
}
it.Reset()
@ -388,12 +389,12 @@ func TestSetIterator(t *testing.T) {
// Order is important
and = iterator.NewAnd()
and.AddSubIterator(it)
and.AddSubIterator(ts.TripleIterator(graph.Subject, ts.ValueOf("B")))
and.AddSubIterator(qs.TripleIterator(quad.Subject, qs.ValueOf("B")))
expect = []*graph.Triple{
expect = []*quad.Quad{
{"B", "status", "cool", "status_graph"},
}
if got := iteratedTriples(ts, and); !reflect.DeepEqual(got, expect) {
if got := iteratedTriples(qs, and); !reflect.DeepEqual(got, expect) {
t.Errorf("Failed to confirm expected results, got:%v expect:%v", got, expect)
}
}
@ -406,17 +407,17 @@ func TestOptimize(t *testing.T) {
if err != nil {
t.Fatalf("Failed to create working directory")
}
ts, err := newTripleStore(tmpDir, nil)
if ts == nil || err != nil {
qs, err := newTripleStore(tmpDir, nil)
if qs == nil || err != nil {
t.Error("Failed to create leveldb TripleStore.")
}
ts.AddTripleSet(makeTripleSet())
qs.AddTripleSet(makeTripleSet())
// With a linksto-fixed pair
fixed := ts.FixedIterator()
fixed.Add(ts.ValueOf("F"))
fixed.AddTag("internal")
lto := iterator.NewLinksTo(ts, fixed, graph.Object)
fixed := qs.FixedIterator()
fixed.Add(qs.ValueOf("F"))
fixed.Tagger().Add("internal")
lto := iterator.NewLinksTo(qs, fixed, quad.Object)
oldIt := lto.Clone()
newIt, ok := lto.Optimize()
@ -427,16 +428,16 @@ func TestOptimize(t *testing.T) {
t.Errorf("Optimized iterator type does not match original, got:%v expect:%v", newIt.Type(), Type())
}
newTriples := iteratedTriples(ts, newIt)
oldTriples := iteratedTriples(ts, oldIt)
newTriples := iteratedTriples(qs, newIt)
oldTriples := iteratedTriples(qs, oldIt)
if !reflect.DeepEqual(newTriples, oldTriples) {
t.Errorf("Optimized iteration does not match original")
}
oldIt.Next()
graph.Next(oldIt)
oldResults := make(map[string]graph.Value)
oldIt.TagResults(oldResults)
newIt.Next()
graph.Next(newIt)
newResults := make(map[string]graph.Value)
newIt.TagResults(newResults)
if !reflect.DeepEqual(newResults, oldResults) {

File diff suppressed because it is too large

View file

@ -37,14 +37,15 @@ func (ts *TripleStore) optimizeLinksTo(it *iterator.LinksTo) (graph.Iterator, bo
if primary.Type() == graph.Fixed {
size, _ := primary.Size()
if size == 1 {
val, ok := primary.Next()
val, ok := graph.Next(primary)
if !ok {
panic("Sizes lie")
}
newIt := ts.TripleIterator(it.Direction(), val)
newIt.CopyTagsFrom(it)
for _, tag := range primary.Tags() {
newIt.AddFixedTag(tag, val)
nt := newIt.Tagger()
nt.CopyFrom(it)
for _, tag := range primary.Tagger().Tags() {
nt.AddFixed(tag, val)
}
it.Close()
return newIt, true

View file

@ -31,6 +31,11 @@ func NewMemstoreAllIterator(ts *TripleStore) *AllIterator {
return &out
}
// No subiterators.
func (it *AllIterator) SubIterators() []graph.Iterator {
return nil
}
func (it *AllIterator) Next() (graph.Value, bool) {
next, out := it.Int64.Next()
if !out {
@ -41,6 +46,5 @@ func (it *AllIterator) Next() (graph.Value, bool) {
if !ok {
return it.Next()
}
it.Last = next
return next, out
}

View file

@ -26,11 +26,13 @@ import (
)
type Iterator struct {
iterator.Base
uid uint64
tags graph.Tagger
tree *llrb.LLRB
data string
isRunning bool
iterLast Int64
result graph.Value
}
type Int64 int64
@ -53,52 +55,87 @@ func IterateOne(tree *llrb.LLRB, last Int64) Int64 {
}
func NewLlrbIterator(tree *llrb.LLRB, data string) *Iterator {
var it Iterator
iterator.BaseInit(&it.Base)
it.tree = tree
it.iterLast = Int64(-1)
it.data = data
return &it
return &Iterator{
uid: iterator.NextUID(),
tree: tree,
iterLast: Int64(-1),
data: data,
}
}
func (it *Iterator) UID() uint64 {
return it.uid
}
func (it *Iterator) Reset() {
it.iterLast = Int64(-1)
}
func (it *Iterator) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Iterator) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *Iterator) Clone() graph.Iterator {
var new_it = NewLlrbIterator(it.tree, it.data)
new_it.CopyTagsFrom(it)
return new_it
m := NewLlrbIterator(it.tree, it.data)
m.tags.CopyFrom(it)
return m
}
func (it *Iterator) Close() {}
func (it *Iterator) Next() (graph.Value, bool) {
graph.NextLogIn(it)
if it.tree.Max() == nil || it.Last == int64(it.tree.Max().(Int64)) {
if it.tree.Max() == nil || it.result == int64(it.tree.Max().(Int64)) {
return graph.NextLogOut(it, nil, false)
}
it.iterLast = IterateOne(it.tree, it.iterLast)
it.Last = int64(it.iterLast)
return graph.NextLogOut(it, it.Last, true)
it.result = int64(it.iterLast)
return graph.NextLogOut(it, it.result, true)
}
func (it *Iterator) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Iterator) Result() graph.Value {
return it.result
}
func (it *Iterator) NextResult() bool {
return false
}
// No subiterators.
func (it *Iterator) SubIterators() []graph.Iterator {
return nil
}
func (it *Iterator) Size() (int64, bool) {
return int64(it.tree.Len()), true
}
func (it *Iterator) Check(v graph.Value) bool {
graph.CheckLogIn(it, v)
func (it *Iterator) Contains(v graph.Value) bool {
graph.ContainsLogIn(it, v)
if it.tree.Has(Int64(v.(int64))) {
it.Last = v
return graph.CheckLogOut(it, v, true)
it.result = v
return graph.ContainsLogOut(it, v, true)
}
return graph.CheckLogOut(it, v, false)
return graph.ContainsLogOut(it, v, false)
}
func (it *Iterator) DebugString(indent int) string {
size, _ := it.Size()
return fmt.Sprintf("%s(%s tags:%s size:%d %s)", strings.Repeat(" ", indent), it.Type(), it.Tags(), size, it.data)
return fmt.Sprintf("%s(%s tags:%s size:%d %s)", strings.Repeat(" ", indent), it.Type(), it.tags.Tags(), size, it.data)
}
var memType graph.Type
@ -119,8 +156,8 @@ func (it *Iterator) Optimize() (graph.Iterator, bool) {
func (it *Iterator) Stats() graph.IteratorStats {
return graph.IteratorStats{
CheckCost: int64(math.Log(float64(it.tree.Len()))) + 1,
NextCost: 1,
Size: int64(it.tree.Len()),
ContainsCost: int64(math.Log(float64(it.tree.Len()))) + 1,
NextCost: 1,
Size: int64(it.tree.Len()),
}
}
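
Note on the `ContainsCost` expression in `Stats` above: `math.Log` is the natural logarithm, so the estimate grows with ln(n) rather than log2(n). The sketch below reproduces the arithmetic in isolation. The name `containsCost` and the guard for an empty tree are assumptions made for illustration and are not part of this commit.

```go
package main

import (
	"fmt"
	"math"
)

// containsCost mirrors the ContainsCost expression from Stats:
// the natural log of the tree size, truncated toward zero, plus one.
// The function name and the n <= 0 guard are illustrative assumptions.
func containsCost(n int) int64 {
	if n <= 0 {
		// math.Log(0) is -Inf, and converting -Inf to int64 is not portable,
		// so this sketch returns the minimum cost for an empty tree.
		return 1
	}
	return int64(math.Log(float64(n))) + 1
}

func main() {
	for _, n := range []int{1, 8, 1000} {
		fmt.Printf("size=%d cost=%d\n", n, containsCost(n))
	}
}
```

With these inputs the costs are 1, 3 and 7, so a thousand-entry tree is estimated at only seven times the cost of a single-entry one.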


@ -20,15 +20,22 @@ import (
"github.com/barakmich/glog"
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
"github.com/petar/GoLLRB/llrb"
)
func init() {
graph.RegisterTripleStore("memstore", func(string, graph.Options) (graph.TripleStore, error) {
return newTripleStore(), nil
}, nil)
}
type TripleDirectionIndex struct {
subject map[int64]*llrb.LLRB
predicate map[int64]*llrb.LLRB
object map[int64]*llrb.LLRB
provenance map[int64]*llrb.LLRB
subject map[int64]*llrb.LLRB
predicate map[int64]*llrb.LLRB
object map[int64]*llrb.LLRB
label map[int64]*llrb.LLRB
}
func NewTripleDirectionIndex() *TripleDirectionIndex {
@ -36,25 +43,25 @@ func NewTripleDirectionIndex() *TripleDirectionIndex {
tdi.subject = make(map[int64]*llrb.LLRB)
tdi.predicate = make(map[int64]*llrb.LLRB)
tdi.object = make(map[int64]*llrb.LLRB)
tdi.provenance = make(map[int64]*llrb.LLRB)
tdi.label = make(map[int64]*llrb.LLRB)
return &tdi
}
func (tdi *TripleDirectionIndex) GetForDir(d graph.Direction) map[int64]*llrb.LLRB {
func (tdi *TripleDirectionIndex) GetForDir(d quad.Direction) map[int64]*llrb.LLRB {
switch d {
case graph.Subject:
case quad.Subject:
return tdi.subject
case graph.Object:
case quad.Object:
return tdi.object
case graph.Predicate:
case quad.Predicate:
return tdi.predicate
case graph.Provenance:
return tdi.provenance
case quad.Label:
return tdi.label
}
panic("illegal direction")
}
func (tdi *TripleDirectionIndex) GetOrCreate(d graph.Direction, id int64) *llrb.LLRB {
func (tdi *TripleDirectionIndex) GetOrCreate(d quad.Direction, id int64) *llrb.LLRB {
directionIndex := tdi.GetForDir(d)
if _, ok := directionIndex[id]; !ok {
directionIndex[id] = llrb.New()
@ -62,7 +69,7 @@ func (tdi *TripleDirectionIndex) GetOrCreate(d graph.Direction, id int64) *llrb.
return directionIndex[id]
}
func (tdi *TripleDirectionIndex) Get(d graph.Direction, id int64) (*llrb.LLRB, bool) {
func (tdi *TripleDirectionIndex) Get(d quad.Direction, id int64) (*llrb.LLRB, bool) {
directionIndex := tdi.GetForDir(d)
tree, exists := directionIndex[id]
return tree, exists
@ -73,7 +80,7 @@ type TripleStore struct {
tripleIdCounter int64
idMap map[string]int64
revIdMap map[int64]string
triples []graph.Triple
triples []quad.Quad
size int64
index TripleDirectionIndex
// vip_index map[string]map[int64]map[string]map[int64]*llrb.Tree
@ -83,10 +90,10 @@ func newTripleStore() *TripleStore {
var ts TripleStore
ts.idMap = make(map[string]int64)
ts.revIdMap = make(map[int64]string)
ts.triples = make([]graph.Triple, 1, 200)
ts.triples = make([]quad.Quad, 1, 200)
// Sentinel null triple so triple indices start at 1
ts.triples[0] = graph.Triple{}
ts.triples[0] = quad.Quad{}
ts.size = 1
ts.index = *NewTripleDirectionIndex()
ts.idCounter = 1
@ -94,18 +101,18 @@ func newTripleStore() *TripleStore {
return &ts
}
func (ts *TripleStore) AddTripleSet(triples []*graph.Triple) {
func (ts *TripleStore) AddTripleSet(triples []*quad.Quad) {
for _, t := range triples {
ts.AddTriple(t)
}
}
func (ts *TripleStore) tripleExists(t *graph.Triple) (bool, int64) {
func (ts *TripleStore) tripleExists(t *quad.Quad) (bool, int64) {
smallest := -1
var smallest_tree *llrb.LLRB
for d := graph.Subject; d <= graph.Provenance; d++ {
for d := quad.Subject; d <= quad.Label; d++ {
sid := t.Get(d)
if d == graph.Provenance && sid == "" {
if d == quad.Label && sid == "" {
continue
}
id, ok := ts.idMap[sid]
@ -137,7 +144,7 @@ func (ts *TripleStore) tripleExists(t *graph.Triple) (bool, int64) {
return false, 0
}
func (ts *TripleStore) AddTriple(t *graph.Triple) {
func (ts *TripleStore) AddTriple(t *quad.Quad) {
if exists, _ := ts.tripleExists(t); exists {
return
}
@ -147,9 +154,9 @@ func (ts *TripleStore) AddTriple(t *graph.Triple) {
ts.size++
ts.tripleIdCounter++
for d := graph.Subject; d <= graph.Provenance; d++ {
for d := quad.Subject; d <= quad.Label; d++ {
sid := t.Get(d)
if d == graph.Provenance && sid == "" {
if d == quad.Label && sid == "" {
continue
}
if _, ok := ts.idMap[sid]; !ok {
@ -159,8 +166,8 @@ func (ts *TripleStore) AddTriple(t *graph.Triple) {
}
}
for d := graph.Subject; d <= graph.Provenance; d++ {
if d == graph.Provenance && t.Get(d) == "" {
for d := quad.Subject; d <= quad.Label; d++ {
if d == quad.Label && t.Get(d) == "" {
continue
}
id := ts.idMap[t.Get(d)]
@ -171,7 +178,7 @@ func (ts *TripleStore) AddTriple(t *graph.Triple) {
// TODO(barakmich): Add VIP indexing
}
func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
func (ts *TripleStore) RemoveTriple(t *quad.Quad) {
var tripleID int64
var exists bool
tripleID = 0
@ -179,11 +186,11 @@ func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
return
}
ts.triples[tripleID] = graph.Triple{}
ts.triples[tripleID] = quad.Quad{}
ts.size--
for d := graph.Subject; d <= graph.Provenance; d++ {
if d == graph.Provenance && t.Get(d) == "" {
for d := quad.Subject; d <= quad.Label; d++ {
if d == quad.Label && t.Get(d) == "" {
continue
}
id := ts.idMap[t.Get(d)]
@ -191,8 +198,8 @@ func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
tree.Delete(Int64(tripleID))
}
for d := graph.Subject; d <= graph.Provenance; d++ {
if d == graph.Provenance && t.Get(d) == "" {
for d := quad.Subject; d <= quad.Label; d++ {
if d == quad.Label && t.Get(d) == "" {
continue
}
id, ok := ts.idMap[t.Get(d)]
@ -200,8 +207,8 @@ func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
continue
}
stillExists := false
for d := graph.Subject; d <= graph.Provenance; d++ {
if d == graph.Provenance && t.Get(d) == "" {
for d := quad.Subject; d <= quad.Label; d++ {
if d == quad.Label && t.Get(d) == "" {
continue
}
nodeTree := ts.index.GetOrCreate(d, id)
@ -217,11 +224,11 @@ func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
}
}
func (ts *TripleStore) Triple(index graph.Value) *graph.Triple {
func (ts *TripleStore) Quad(index graph.Value) *quad.Quad {
return &ts.triples[index.(int64)]
}
func (ts *TripleStore) TripleIterator(d graph.Direction, value graph.Value) graph.Iterator {
func (ts *TripleStore) TripleIterator(d quad.Direction, value graph.Value) graph.Iterator {
index, ok := ts.index.Get(d, value.(int64))
data := fmt.Sprintf("dir:%s val:%d", d, value.(int64))
if ok {
@ -239,7 +246,7 @@ func (ts *TripleStore) DebugPrint() {
if i == 0 {
continue
}
glog.V(2).Infoln("%d: %s", i, t)
glog.V(2).Infof("%d: %s", i, t)
}
}
@ -259,8 +266,8 @@ func (ts *TripleStore) FixedIterator() graph.FixedIterator {
return iterator.NewFixedIteratorWithCompare(iterator.BasicEquality)
}
func (ts *TripleStore) TripleDirection(val graph.Value, d graph.Direction) graph.Value {
name := ts.Triple(val).Get(d)
func (ts *TripleStore) TripleDirection(val graph.Value, d quad.Direction) graph.Value {
name := ts.Quad(val).Get(d)
return ts.ValueOf(name)
}
@ -268,9 +275,3 @@ func (ts *TripleStore) NodesAllIterator() graph.Iterator {
return NewMemstoreAllIterator(ts)
}
func (ts *TripleStore) Close() {}
func init() {
graph.RegisterTripleStore("memstore", func(string, graph.Options) (graph.TripleStore, error) {
return newTripleStore(), nil
}, nil)
}
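
Note on the direction loops in this file: `tripleExists`, `AddTriple` and `RemoveTriple` all walk `quad.Subject` through `quad.Label` and skip the label when it is empty. A minimal sketch of that pattern follows. `Direction`, `Quad` and `indexedDirections` are local stand-ins written for this example, assumed to match the shape of the `quad` package, and are not the package's actual code.

```go
package main

import "fmt"

// Direction is a local stand-in for quad.Direction.
type Direction int

const (
	Any Direction = iota
	Subject
	Predicate
	Object
	Label
)

// Quad is a local stand-in for quad.Quad.
type Quad struct {
	Subject, Predicate, Object, Label string
}

// Get returns the field of q that corresponds to direction d.
func (q Quad) Get(d Direction) string {
	switch d {
	case Subject:
		return q.Subject
	case Predicate:
		return q.Predicate
	case Object:
		return q.Object
	case Label:
		return q.Label
	}
	return ""
}

// indexedDirections lists the directions the store loops would touch for q.
// An empty label is skipped, so an unlabelled quad is indexed three ways.
func indexedDirections(q Quad) []Direction {
	var out []Direction
	for d := Subject; d <= Label; d++ {
		if d == Label && q.Get(d) == "" {
			continue
		}
		out = append(out, d)
	}
	return out
}

func main() {
	fmt.Println(len(indexedDirections(Quad{"A", "follows", "B", ""})))
	fmt.Println(len(indexedDirections(Quad{"G", "status", "cool", "status_graph"})))
}
```

The loop relies on the direction constants being contiguous and ordered, which is why the stand-in declares them with `iota`.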


@ -37,14 +37,15 @@ func (ts *TripleStore) optimizeLinksTo(it *iterator.LinksTo) (graph.Iterator, bo
if primary.Type() == graph.Fixed {
size, _ := primary.Size()
if size == 1 {
val, ok := primary.Next()
val, ok := graph.Next(primary)
if !ok {
panic("Sizes lie")
}
newIt := ts.TripleIterator(it.Direction(), val)
newIt.CopyTagsFrom(it)
for _, tag := range primary.Tags() {
newIt.AddFixedTag(tag, val)
nt := newIt.Tagger()
nt.CopyFrom(it)
for _, tag := range primary.Tagger().Tags() {
nt.AddFixed(tag, val)
}
return newIt, true
}


@ -21,6 +21,7 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
// This is a simple test graph.
@ -36,7 +37,7 @@ import (
// \-->|#D#|------------->+---+
// +---+
//
var simpleGraph = []*graph.Triple{
var simpleGraph = []*quad.Quad{
{"A", "follows", "B", ""},
{"C", "follows", "B", ""},
{"C", "follows", "D", ""},
@ -50,7 +51,7 @@ var simpleGraph = []*graph.Triple{
{"G", "status", "cool", "status_graph"},
}
func makeTestStore(data []*graph.Triple) (*TripleStore, []pair) {
func makeTestStore(data []*quad.Quad) (*TripleStore, []pair) {
seen := make(map[string]struct{})
ts := newTripleStore()
var (
@ -58,7 +59,7 @@ func makeTestStore(data []*graph.Triple) (*TripleStore, []pair) {
ind []pair
)
for _, t := range data {
for _, qp := range []string{t.Subject, t.Predicate, t.Object, t.Provenance} {
for _, qp := range []string{t.Subject, t.Predicate, t.Object, t.Label} {
if _, ok := seen[qp]; !ok && qp != "" {
val++
ind = append(ind, pair{qp, val})
@ -105,10 +106,10 @@ func TestIteratorsAndNextResultOrderA(t *testing.T) {
all := ts.NodesAllIterator()
innerAnd := iterator.NewAnd()
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed2, graph.Predicate))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, all, graph.Object))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed2, quad.Predicate))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, all, quad.Object))
hasa := iterator.NewHasA(ts, innerAnd, graph.Subject)
hasa := iterator.NewHasA(ts, innerAnd, quad.Subject)
outerAnd := iterator.NewAnd()
outerAnd.AddSubIterator(fixed)
outerAnd.AddSubIterator(hasa)
@ -149,8 +150,8 @@ func TestLinksToOptimization(t *testing.T) {
fixed := ts.FixedIterator()
fixed.Add(ts.ValueOf("cool"))
lto := iterator.NewLinksTo(ts, fixed, graph.Object)
lto.AddTag("foo")
lto := iterator.NewLinksTo(ts, fixed, quad.Object)
lto.Tagger().Add("foo")
newIt, changed := lto.Optimize()
if !changed {
@ -165,7 +166,8 @@ func TestLinksToOptimization(t *testing.T) {
if v_clone.DebugString(0) != v.DebugString(0) {
t.Fatal("Wrong iterator. Got ", v_clone.DebugString(0))
}
if len(v_clone.Tags()) < 1 || v_clone.Tags()[0] != "foo" {
vt := v_clone.Tagger()
if len(vt.Tags()) < 1 || vt.Tags()[0] != "foo" {
t.Fatal("Tag on LinksTo did not persist")
}
}
@ -173,7 +175,7 @@ func TestLinksToOptimization(t *testing.T) {
func TestRemoveTriple(t *testing.T) {
ts, _ := makeTestStore(simpleGraph)
ts.RemoveTriple(&graph.Triple{"E", "follows", "F", ""})
ts.RemoveTriple(&quad.Quad{"E", "follows", "F", ""})
fixed := ts.FixedIterator()
fixed.Add(ts.ValueOf("E"))
@ -182,13 +184,13 @@ func TestRemoveTriple(t *testing.T) {
fixed2.Add(ts.ValueOf("follows"))
innerAnd := iterator.NewAnd()
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed, graph.Subject))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed2, graph.Predicate))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed, quad.Subject))
innerAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed2, quad.Predicate))
hasa := iterator.NewHasA(ts, innerAnd, graph.Object)
hasa := iterator.NewHasA(ts, innerAnd, quad.Object)
newIt, _ := hasa.Optimize()
_, ok := newIt.Next()
_, ok := graph.Next(newIt)
if ok {
t.Error("E should not have any followers.")
}


@ -24,12 +24,14 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
type Iterator struct {
iterator.Base
ts *TripleStore
dir graph.Direction
uid uint64
tags graph.Tagger
qs *TripleStore
dir quad.Direction
iter *mgo.Iter
hash string
name string
@ -37,60 +39,73 @@ type Iterator struct {
isAll bool
constraint bson.M
collection string
result graph.Value
}
func NewIterator(ts *TripleStore, collection string, d graph.Direction, val graph.Value) *Iterator {
var m Iterator
iterator.BaseInit(&m.Base)
func NewIterator(qs *TripleStore, collection string, d quad.Direction, val graph.Value) *Iterator {
name := qs.NameOf(val)
m.name = ts.NameOf(val)
m.collection = collection
var constraint bson.M
switch d {
case graph.Subject:
m.constraint = bson.M{"Subject": m.name}
case graph.Predicate:
m.constraint = bson.M{"Predicate": m.name}
case graph.Object:
m.constraint = bson.M{"Object": m.name}
case graph.Provenance:
m.constraint = bson.M{"Provenance": m.name}
case quad.Subject:
constraint = bson.M{"Subject": name}
case quad.Predicate:
constraint = bson.M{"Predicate": name}
case quad.Object:
constraint = bson.M{"Object": name}
case quad.Label:
constraint = bson.M{"Label": name}
}
m.ts = ts
m.dir = d
m.iter = ts.db.C(collection).Find(m.constraint).Iter()
size, err := ts.db.C(collection).Find(m.constraint).Count()
size, err := qs.db.C(collection).Find(constraint).Count()
if err != nil {
// FIXME(kortschak) This should be passed back rather than just logging.
glog.Errorln("Trouble getting size for iterator! ", err)
return nil
}
m.size = int64(size)
m.hash = val.(string)
m.isAll = false
return &m
return &Iterator{
uid: iterator.NextUID(),
name: name,
constraint: constraint,
collection: collection,
qs: qs,
dir: d,
iter: qs.db.C(collection).Find(constraint).Iter(),
size: int64(size),
hash: val.(string),
isAll: false,
}
}
func NewAllIterator(ts *TripleStore, collection string) *Iterator {
var m Iterator
m.ts = ts
m.dir = graph.Any
m.constraint = nil
m.collection = collection
m.iter = ts.db.C(collection).Find(nil).Iter()
size, err := ts.db.C(collection).Count()
func NewAllIterator(qs *TripleStore, collection string) *Iterator {
size, err := qs.db.C(collection).Count()
if err != nil {
// FIXME(kortschak) This should be passed back rather than just logging.
glog.Errorln("Trouble getting size for iterator! ", err)
return nil
}
m.size = int64(size)
m.hash = ""
m.isAll = true
return &m
return &Iterator{
uid: iterator.NextUID(),
qs: qs,
dir: quad.Any,
constraint: nil,
collection: collection,
iter: qs.db.C(collection).Find(nil).Iter(),
size: int64(size),
hash: "",
isAll: true,
}
}
func (it *Iterator) UID() uint64 {
return it.uid
}
func (it *Iterator) Reset() {
it.iter.Close()
it.iter = it.ts.db.C(it.collection).Find(it.constraint).Iter()
it.iter = it.qs.db.C(it.collection).Find(it.constraint).Iter()
}
@ -98,15 +113,29 @@ func (it *Iterator) Close() {
it.iter.Close()
}
func (it *Iterator) Clone() graph.Iterator {
var newM graph.Iterator
if it.isAll {
newM = NewAllIterator(it.ts, it.collection)
} else {
newM = NewIterator(it.ts, it.collection, it.dir, it.hash)
func (it *Iterator) Tagger() *graph.Tagger {
return &it.tags
}
func (it *Iterator) TagResults(dst map[string]graph.Value) {
for _, tag := range it.tags.Tags() {
dst[tag] = it.Result()
}
newM.CopyTagsFrom(it)
return newM
for tag, value := range it.tags.Fixed() {
dst[tag] = value
}
}
func (it *Iterator) Clone() graph.Iterator {
var m *Iterator
if it.isAll {
m = NewAllIterator(it.qs, it.collection)
} else {
m = NewIterator(it.qs, it.collection, it.dir, it.hash)
}
m.tags.CopyFrom(it)
return m
}
func (it *Iterator) Next() (graph.Value, bool) {
@ -124,33 +153,50 @@ func (it *Iterator) Next() (graph.Value, bool) {
}
return nil, false
}
it.Last = result.Id
it.result = result.Id
return result.Id, true
}
func (it *Iterator) Check(v graph.Value) bool {
graph.CheckLogIn(it, v)
func (it *Iterator) ResultTree() *graph.ResultTree {
return graph.NewResultTree(it.Result())
}
func (it *Iterator) Result() graph.Value {
return it.result
}
func (it *Iterator) NextResult() bool {
return false
}
// No subiterators.
func (it *Iterator) SubIterators() []graph.Iterator {
return nil
}
func (it *Iterator) Contains(v graph.Value) bool {
graph.ContainsLogIn(it, v)
if it.isAll {
it.Last = v
return graph.CheckLogOut(it, v, true)
it.result = v
return graph.ContainsLogOut(it, v, true)
}
var offset int
switch it.dir {
case graph.Subject:
case quad.Subject:
offset = 0
case graph.Predicate:
offset = (it.ts.hasher.Size() * 2)
case graph.Object:
offset = (it.ts.hasher.Size() * 2) * 2
case graph.Provenance:
offset = (it.ts.hasher.Size() * 2) * 3
case quad.Predicate:
offset = (it.qs.hasher.Size() * 2)
case quad.Object:
offset = (it.qs.hasher.Size() * 2) * 2
case quad.Label:
offset = (it.qs.hasher.Size() * 2) * 3
}
val := v.(string)[offset : it.ts.hasher.Size()*2+offset]
val := v.(string)[offset : it.qs.hasher.Size()*2+offset]
if val == it.hash {
it.Last = v
return graph.CheckLogOut(it, v, true)
it.result = v
return graph.ContainsLogOut(it, v, true)
}
return graph.CheckLogOut(it, v, false)
return graph.ContainsLogOut(it, v, false)
}
func (it *Iterator) Size() (int64, bool) {
@ -183,8 +229,8 @@ func (it *Iterator) DebugString(indent int) string {
func (it *Iterator) Stats() graph.IteratorStats {
size, _ := it.Size()
return graph.IteratorStats{
CheckCost: 1,
NextCost: 5,
Size: size,
ContainsCost: 1,
NextCost: 5,
Size: size,
}
}
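
Note on the offset arithmetic in `Contains` above: the Mongo backend builds a quad ID by concatenating four hex-encoded SHA-1 hashes, so each direction occupies a fixed 40-character window. The sketch below shows that layout using only the standard library. The helper names `hashOf`, `quadID` and `part` are hypothetical, and `sha1.Sum` is used here in place of the shared `hash.Hash` the store keeps.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// hexLen is the length of one hex-encoded SHA-1 digest (20 bytes, 40 chars).
const hexLen = sha1.Size * 2

// hashOf returns the hex-encoded SHA-1 digest of s.
func hashOf(s string) string {
	sum := sha1.Sum([]byte(s))
	return hex.EncodeToString(sum[:])
}

// quadID concatenates the four hashes in subject, predicate, object, label order.
func quadID(subject, predicate, object, label string) string {
	return hashOf(subject) + hashOf(predicate) + hashOf(object) + hashOf(label)
}

// part slices the hash at position pos (0=subject ... 3=label) out of id.
func part(id string, pos int) string {
	offset := hexLen * pos
	return id[offset : offset+hexLen]
}

func main() {
	id := quadID("A", "follows", "B", "")
	fmt.Println(len(id))
	fmt.Println(part(id, 1) == hashOf("follows"))
}
```

Because the windows are fixed width, a direction's node ID can be recovered from a quad ID without a database round trip, which is what `TripleDirection` in the Mongo store takes advantage of.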


@ -18,7 +18,7 @@ import (
"crypto/sha1"
"encoding/hex"
"hash"
"log"
"io"
"gopkg.in/mgo.v2"
"gopkg.in/mgo.v2/bson"
@ -26,8 +26,16 @@ import (
"github.com/barakmich/glog"
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
func init() {
graph.RegisterTripleStore("mongo", newTripleStore, createNewMongoGraph)
}
// Guarantee we satisfy graph.BulkLoader.
var _ graph.BulkLoader = (*TripleStore)(nil)
const DefaultDBName = "cayley"
type TripleStore struct {
@ -60,13 +68,13 @@ func createNewMongoGraph(addr string, options graph.Options) error {
db.C("triples").EnsureIndex(indexOpts)
indexOpts.Key = []string{"Obj"}
db.C("triples").EnsureIndex(indexOpts)
indexOpts.Key = []string{"Provenance"}
indexOpts.Key = []string{"Label"}
db.C("triples").EnsureIndex(indexOpts)
return nil
}
func newTripleStore(addr string, options graph.Options) (graph.TripleStore, error) {
var ts TripleStore
var qs TripleStore
conn, err := mgo.Dial(addr)
if err != nil {
return nil, err
@ -76,26 +84,26 @@ func newTripleStore(addr string, options graph.Options) (graph.TripleStore, erro
if val, ok := options.StringKey("database_name"); ok {
dbName = val
}
ts.db = conn.DB(dbName)
ts.session = conn
ts.hasher = sha1.New()
ts.idCache = NewIDLru(1 << 16)
return &ts, nil
qs.db = conn.DB(dbName)
qs.session = conn
qs.hasher = sha1.New()
qs.idCache = NewIDLru(1 << 16)
return &qs, nil
}
func (ts *TripleStore) getIdForTriple(t *graph.Triple) string {
id := ts.ConvertStringToByteHash(t.Subject)
id += ts.ConvertStringToByteHash(t.Predicate)
id += ts.ConvertStringToByteHash(t.Object)
id += ts.ConvertStringToByteHash(t.Provenance)
func (qs *TripleStore) getIdForTriple(t *quad.Quad) string {
id := qs.ConvertStringToByteHash(t.Subject)
id += qs.ConvertStringToByteHash(t.Predicate)
id += qs.ConvertStringToByteHash(t.Object)
id += qs.ConvertStringToByteHash(t.Label)
return id
}
func (ts *TripleStore) ConvertStringToByteHash(s string) string {
ts.hasher.Reset()
key := make([]byte, 0, ts.hasher.Size())
ts.hasher.Write([]byte(s))
key = ts.hasher.Sum(key)
func (qs *TripleStore) ConvertStringToByteHash(s string) string {
qs.hasher.Reset()
key := make([]byte, 0, qs.hasher.Size())
qs.hasher.Write([]byte(s))
key = qs.hasher.Sum(key)
return hex.EncodeToString(key)
}
@ -105,10 +113,10 @@ type MongoNode struct {
Size int "Size"
}
func (ts *TripleStore) updateNodeBy(node_name string, inc int) {
func (qs *TripleStore) updateNodeBy(node_name string, inc int) {
var size MongoNode
node := ts.ValueOf(node_name)
err := ts.db.C("nodes").FindId(node).One(&size)
node := qs.ValueOf(node_name)
err := qs.db.C("nodes").FindId(node).One(&size)
if err != nil {
if err.Error() == "not found" {
// Not found. Okay.
@ -116,7 +124,7 @@ func (ts *TripleStore) updateNodeBy(node_name string, inc int) {
size.Name = node_name
size.Size = inc
} else {
glog.Error("Error:", err)
glog.Errorf("Error: %v", err)
return
}
} else {
@ -128,134 +136,134 @@ func (ts *TripleStore) updateNodeBy(node_name string, inc int) {
// Removing something...
if inc < 0 {
if size.Size <= 0 {
err := ts.db.C("nodes").RemoveId(node)
err := qs.db.C("nodes").RemoveId(node)
if err != nil {
glog.Error("Error: ", err, " while removing node ", node_name)
glog.Errorf("Error: %v while removing node %s", err, node_name)
return
}
}
}
_, err2 := ts.db.C("nodes").UpsertId(node, size)
_, err2 := qs.db.C("nodes").UpsertId(node, size)
if err2 != nil {
glog.Error("Error: ", err)
glog.Errorf("Error: %v", err2)
}
}
func (ts *TripleStore) writeTriple(t *graph.Triple) bool {
func (qs *TripleStore) writeTriple(t *quad.Quad) bool {
tripledoc := bson.M{
"_id": ts.getIdForTriple(t),
"Subject": t.Subject,
"Predicate": t.Predicate,
"Object": t.Object,
"Provenance": t.Provenance,
"_id": qs.getIdForTriple(t),
"Subject": t.Subject,
"Predicate": t.Predicate,
"Object": t.Object,
"Label": t.Label,
}
err := ts.db.C("triples").Insert(tripledoc)
err := qs.db.C("triples").Insert(tripledoc)
if err != nil {
// Among the reasons I hate MongoDB. "Errors don't happen! Right guys?"
if err.(*mgo.LastError).Code == 11000 {
return false
}
glog.Error("Error: ", err)
glog.Errorf("Error: %v", err)
return false
}
return true
}
func (ts *TripleStore) AddTriple(t *graph.Triple) {
_ = ts.writeTriple(t)
ts.updateNodeBy(t.Subject, 1)
ts.updateNodeBy(t.Predicate, 1)
ts.updateNodeBy(t.Object, 1)
if t.Provenance != "" {
ts.updateNodeBy(t.Provenance, 1)
func (qs *TripleStore) AddTriple(t *quad.Quad) {
_ = qs.writeTriple(t)
qs.updateNodeBy(t.Subject, 1)
qs.updateNodeBy(t.Predicate, 1)
qs.updateNodeBy(t.Object, 1)
if t.Label != "" {
qs.updateNodeBy(t.Label, 1)
}
}
func (ts *TripleStore) AddTripleSet(in []*graph.Triple) {
ts.session.SetSafe(nil)
func (qs *TripleStore) AddTripleSet(in []*quad.Quad) {
qs.session.SetSafe(nil)
ids := make(map[string]int)
for _, t := range in {
wrote := ts.writeTriple(t)
wrote := qs.writeTriple(t)
if wrote {
ids[t.Subject]++
ids[t.Object]++
ids[t.Predicate]++
if t.Provenance != "" {
ids[t.Provenance]++
if t.Label != "" {
ids[t.Label]++
}
}
}
for k, v := range ids {
ts.updateNodeBy(k, v)
qs.updateNodeBy(k, v)
}
ts.session.SetSafe(&mgo.Safe{})
qs.session.SetSafe(&mgo.Safe{})
}
func (ts *TripleStore) RemoveTriple(t *graph.Triple) {
err := ts.db.C("triples").RemoveId(ts.getIdForTriple(t))
func (qs *TripleStore) RemoveTriple(t *quad.Quad) {
err := qs.db.C("triples").RemoveId(qs.getIdForTriple(t))
if err == mgo.ErrNotFound {
return
} else if err != nil {
log.Println("Error: ", err, " while removing triple ", t)
glog.Errorf("Error: %v while removing triple %v", err, t)
return
}
ts.updateNodeBy(t.Subject, -1)
ts.updateNodeBy(t.Predicate, -1)
ts.updateNodeBy(t.Object, -1)
if t.Provenance != "" {
ts.updateNodeBy(t.Provenance, -1)
qs.updateNodeBy(t.Subject, -1)
qs.updateNodeBy(t.Predicate, -1)
qs.updateNodeBy(t.Object, -1)
if t.Label != "" {
qs.updateNodeBy(t.Label, -1)
}
}
func (ts *TripleStore) Triple(val graph.Value) *graph.Triple {
func (qs *TripleStore) Quad(val graph.Value) *quad.Quad {
var bsonDoc bson.M
err := ts.db.C("triples").FindId(val.(string)).One(&bsonDoc)
err := qs.db.C("triples").FindId(val.(string)).One(&bsonDoc)
if err != nil {
log.Println("Error: Couldn't retrieve triple", val.(string), err)
glog.Errorf("Error: Couldn't retrieve triple %s %v", val, err)
}
return &graph.Triple{
return &quad.Quad{
bsonDoc["Subject"].(string),
bsonDoc["Predicate"].(string),
bsonDoc["Object"].(string),
bsonDoc["Provenance"].(string),
bsonDoc["Label"].(string),
}
}
func (ts *TripleStore) TripleIterator(d graph.Direction, val graph.Value) graph.Iterator {
return NewIterator(ts, "triples", d, val)
func (qs *TripleStore) TripleIterator(d quad.Direction, val graph.Value) graph.Iterator {
return NewIterator(qs, "triples", d, val)
}
func (ts *TripleStore) NodesAllIterator() graph.Iterator {
return NewAllIterator(ts, "nodes")
func (qs *TripleStore) NodesAllIterator() graph.Iterator {
return NewAllIterator(qs, "nodes")
}
func (ts *TripleStore) TriplesAllIterator() graph.Iterator {
return NewAllIterator(ts, "triples")
func (qs *TripleStore) TriplesAllIterator() graph.Iterator {
return NewAllIterator(qs, "triples")
}
func (ts *TripleStore) ValueOf(s string) graph.Value {
return ts.ConvertStringToByteHash(s)
func (qs *TripleStore) ValueOf(s string) graph.Value {
return qs.ConvertStringToByteHash(s)
}
func (ts *TripleStore) NameOf(v graph.Value) string {
val, ok := ts.idCache.Get(v.(string))
func (qs *TripleStore) NameOf(v graph.Value) string {
val, ok := qs.idCache.Get(v.(string))
if ok {
return val
}
var node MongoNode
err := ts.db.C("nodes").FindId(v.(string)).One(&node)
err := qs.db.C("nodes").FindId(v.(string)).One(&node)
if err != nil {
log.Println("Error: Couldn't retrieve node", v.(string), err)
glog.Errorf("Error: Couldn't retrieve node %s %v", v, err)
}
ts.idCache.Put(v.(string), node.Name)
qs.idCache.Put(v.(string), node.Name)
return node.Name
}
func (ts *TripleStore) Size() int64 {
count, err := ts.db.C("triples").Count()
func (qs *TripleStore) Size() int64 {
count, err := qs.db.C("triples").Count()
if err != nil {
glog.Error("Error: ", err)
glog.Errorf("Error: %v", err)
return 0
}
return int64(count)
@ -265,40 +273,48 @@ func compareStrings(a, b graph.Value) bool {
return a.(string) == b.(string)
}
func (ts *TripleStore) FixedIterator() graph.FixedIterator {
func (qs *TripleStore) FixedIterator() graph.FixedIterator {
return iterator.NewFixedIteratorWithCompare(compareStrings)
}
func (ts *TripleStore) Close() {
ts.db.Session.Close()
func (qs *TripleStore) Close() {
qs.db.Session.Close()
}
func (ts *TripleStore) TripleDirection(in graph.Value, d graph.Direction) graph.Value {
func (qs *TripleStore) TripleDirection(in graph.Value, d quad.Direction) graph.Value {
// Maybe do the trick here
var offset int
switch d {
case graph.Subject:
case quad.Subject:
offset = 0
case graph.Predicate:
offset = (ts.hasher.Size() * 2)
case graph.Object:
offset = (ts.hasher.Size() * 2) * 2
case graph.Provenance:
offset = (ts.hasher.Size() * 2) * 3
case quad.Predicate:
offset = (qs.hasher.Size() * 2)
case quad.Object:
offset = (qs.hasher.Size() * 2) * 2
case quad.Label:
offset = (qs.hasher.Size() * 2) * 3
}
val := in.(string)[offset : ts.hasher.Size()*2+offset]
val := in.(string)[offset : qs.hasher.Size()*2+offset]
return val
}
func (ts *TripleStore) BulkLoad(t_chan chan *graph.Triple) bool {
if ts.Size() != 0 {
return false
func (qs *TripleStore) BulkLoad(dec quad.Unmarshaler) error {
if qs.Size() != 0 {
return graph.ErrCannotBulkLoad
}
ts.session.SetSafe(nil)
for triple := range t_chan {
ts.writeTriple(triple)
qs.session.SetSafe(nil)
for {
q, err := dec.Unmarshal()
if err != nil {
if err != io.EOF {
return err
}
break
}
qs.writeTriple(q)
}
outputTo := bson.M{"replace": "nodes", "sharded": true}
glog.Infoln("Mapreducing")
job := mgo.MapReduce{
@ -311,8 +327,8 @@ func (ts *TripleStore) BulkLoad(t_chan chan *graph.Triple) bool {
emit(s_key, {"_id": s_key, "Name" : this.Subject, "Size" : 1})
emit(p_key, {"_id": p_key, "Name" : this.Predicate, "Size" : 1})
emit(o_key, {"_id": o_key, "Name" : this.Object, "Size" : 1})
if (this.Provenance != "") {
emit(c_key, {"_id": c_key, "Name" : this.Provenance, "Size" : 1})
if (this.Label != "") {
emit(c_key, {"_id": c_key, "Name" : this.Label, "Size" : 1})
}
}
`,
@ -330,16 +346,13 @@ func (ts *TripleStore) BulkLoad(t_chan chan *graph.Triple) bool {
`,
Out: outputTo,
}
ts.db.C("triples").Find(nil).MapReduce(&job, nil)
qs.db.C("triples").Find(nil).MapReduce(&job, nil)
glog.Infoln("Fixing")
ts.db.Run(bson.D{{"eval", `function() { db.nodes.find().forEach(function (result) {
qs.db.Run(bson.D{{"eval", `function() { db.nodes.find().forEach(function (result) {
db.nodes.update({"_id": result._id}, result.value)
}) }`}, {"args", bson.D{}}}, nil)
ts.session.SetSafe(&mgo.Safe{})
return true
}
qs.session.SetSafe(&mgo.Safe{})
func init() {
graph.RegisterTripleStore("mongo", newTripleStore, createNewMongoGraph)
return nil
}
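
Note on the new `BulkLoad` signature: the channel of triples is replaced by a decoder that is drained until it reports `io.EOF`, and any other error is returned to the caller. A minimal sketch of that loop follows. `Quad`, `Unmarshaler`, `sliceDecoder` and `drain` are stand-ins written for this example, with the `Unmarshal() (*Quad, error)` shape inferred from how `dec.Unmarshal()` is used above.

```go
package main

import (
	"fmt"
	"io"
)

// Quad is a local stand-in for quad.Quad.
type Quad struct {
	Subject, Predicate, Object, Label string
}

// Unmarshaler is a local stand-in for quad.Unmarshaler.
type Unmarshaler interface {
	Unmarshal() (*Quad, error)
}

// sliceDecoder is a hypothetical decoder backed by an in-memory slice.
type sliceDecoder struct {
	quads []*Quad
	pos   int
}

// Unmarshal returns the next quad, or io.EOF once the slice is exhausted.
func (d *sliceDecoder) Unmarshal() (*Quad, error) {
	if d.pos >= len(d.quads) {
		return nil, io.EOF
	}
	q := d.quads[d.pos]
	d.pos++
	return q, nil
}

// drain reads quads until io.EOF, passing each one to write.
// It returns the number of quads written and the first non-EOF error.
func drain(dec Unmarshaler, write func(*Quad)) (int, error) {
	n := 0
	for {
		q, err := dec.Unmarshal()
		if err != nil {
			if err != io.EOF {
				return n, err
			}
			break
		}
		write(q)
		n++
	}
	return n, nil
}

func main() {
	dec := &sliceDecoder{quads: []*Quad{
		{"A", "follows", "B", ""},
		{"C", "follows", "B", ""},
	}}
	n, err := drain(dec, func(q *Quad) { fmt.Println(q.Subject, q.Predicate, q.Object) })
	fmt.Println(n, err)
}
```

Treating `io.EOF` as the normal end of input keeps a truncated or malformed stream distinguishable from a complete one, which the old `bool` return could not express.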


@ -37,14 +37,15 @@ func (ts *TripleStore) optimizeLinksTo(it *iterator.LinksTo) (graph.Iterator, bo
if primary.Type() == graph.Fixed {
size, _ := primary.Size()
if size == 1 {
val, ok := primary.Next()
val, ok := graph.Next(primary)
if !ok {
panic("Sizes lie")
}
newIt := ts.TripleIterator(it.Direction(), val)
newIt.CopyTagsFrom(it)
for _, tag := range primary.Tags() {
newIt.AddFixedTag(tag, val)
nt := newIt.Tagger()
nt.CopyFrom(it)
for _, tag := range primary.Tagger().Tags() {
nt.AddFixed(tag, val)
}
it.Close()
return newIt, true


@ -40,7 +40,7 @@ func (t *ResultTree) AddSubtree(sub *ResultTree) {
t.subtrees = append(t.subtrees, sub)
}
func StringResultTreeEvaluator(it Iterator) string {
func StringResultTreeEvaluator(it Nexter) string {
ok := true
out := ""
for {
@ -59,6 +59,6 @@ func StringResultTreeEvaluator(it Iterator) string {
return out
}
func PrintResultTreeEvaluator(it Iterator) {
func PrintResultTreeEvaluator(it Nexter) {
fmt.Print(StringResultTreeEvaluator(it))
}


@ -26,7 +26,7 @@ func TestSingleIterator(t *testing.T) {
result := StringResultTreeEvaluator(all)
expected := "(1)\n(2)\n(3)\n"
if expected != result {
t.Errorf("Expected \"%s\" got \"%s\"", expected, result)
t.Errorf("Expected %q got %q", expected, result)
}
}
@ -40,6 +40,6 @@ func TestAndIterator(t *testing.T) {
result := StringResultTreeEvaluator(and)
expected := "(3 (3) (3))\n"
if expected != result {
t.Errorf("Expected \"%s\" got \"%s\"", expected, result)
t.Errorf("Expected %q got %q", expected, result)
}
}


@ -23,7 +23,9 @@ package graph
import (
"errors"
"github.com/barakmich/glog"
"github.com/google/cayley/quad"
)
// Defines an opaque "triple store value" type. However the backend wishes to
@ -42,11 +44,11 @@ type TripleStore interface {
ApplyTransactions([]*Transaction) error
// Given an opaque token, returns the triple for that token from the store.
Triple(Value) *Triple
Quad(Value) *quad.Quad
// Given a direction and a token, creates an iterator of links which have
// that node token in that directional field.
TripleIterator(Direction, Value) Iterator
TripleIterator(quad.Direction, Value) Iterator
// Returns an iterator enumerating all nodes in the graph.
NodesAllIterator() Iterator
@ -86,8 +88,8 @@ type TripleStore interface {
// gives the TripleStore the opportunity to make this optimization.
//
// Iterators will call this. At worst, a valid implementation is
// ts.IdFor(ts.Triple(triple_id).Get(dir))
TripleDirection(triple_id Value, d Direction) Value
// ts.IdFor(ts.Quad(id).Get(dir))
TripleDirection(id Value, d quad.Direction) Value
}
type Options map[string]interface{}
@ -119,14 +121,10 @@ func (d Options) StringKey(key string) (string, bool) {
var ErrCannotBulkLoad = errors.New("triplestore: cannot bulk load")
type BulkLoader interface {
// BulkLoad loads Triples from a TripleUnmarshaler in bulk to the TripleStore.
// BulkLoad loads Quads from a quad.Unmarshaler in bulk to the TripleStore.
// It returns ErrCannotBulkLoad if bulk loading is not possible, for example when
// the store can only bulk load into an empty database and the database is non-empty.
BulkLoad(TripleUnmarshaler) error
}
type TripleUnmarshaler interface {
Unmarshal() (*Triple, error)
BulkLoad(quad.Unmarshaler) error
}
type NewStoreFunc func(string, Options) (TripleStore, error)


@ -19,22 +19,22 @@ import (
"reflect"
"testing"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
)
var parseTests = []struct {
message string
input string
expect []*graph.Triple
expect []*quad.Quad
err error
}{
{
message: "parse correct JSON",
input: `[
{"subject": "foo", "predicate": "bar", "object": "baz"},
{"subject": "foo", "predicate": "bar", "object": "baz", "provenance": "graph"}
{"subject": "foo", "predicate": "bar", "object": "baz", "label": "graph"}
]`,
expect: []*graph.Triple{
expect: []*quad.Quad{
{"foo", "bar", "baz", ""},
{"foo", "bar", "baz", "graph"},
},
@ -45,7 +45,7 @@ var parseTests = []struct {
input: `[
{"subject": "foo", "predicate": "bar", "object": "foo", "something_else": "extra data"}
]`,
expect: []*graph.Triple{
expect: []*quad.Quad{
{"foo", "bar", "foo", ""},
},
err: nil,
@ -56,7 +56,7 @@ var parseTests = []struct {
{"subject": "foo", "predicate": "bar"}
]`,
expect: nil,
err: fmt.Errorf("Invalid triple at index %d. %v", 0, &graph.Triple{"foo", "bar", "", ""}),
err: fmt.Errorf("Invalid triple at index %d. %v", 0, &quad.Quad{"foo", "bar", "", ""}),
},
}


@ -22,7 +22,7 @@ import (
"github.com/julienschmidt/httprouter"
"github.com/google/cayley/graph"
"github.com/google/cayley/query"
"github.com/google/cayley/query/gremlin"
"github.com/google/cayley/query/mql"
)
@ -47,7 +47,7 @@ func WrapResult(result interface{}) ([]byte, error) {
return json.MarshalIndent(wrap, "", " ")
}
func RunJsonQuery(query string, ses graph.HttpSession) (interface{}, error) {
func RunJsonQuery(query string, ses query.HttpSession) (interface{}, error) {
c := make(chan interface{}, 5)
go ses.ExecInput(query, c, 100)
for res := range c {
@ -56,7 +56,7 @@ func RunJsonQuery(query string, ses graph.HttpSession) (interface{}, error) {
return ses.GetJson()
}
func GetQueryShape(query string, ses graph.HttpSession) ([]byte, error) {
func GetQueryShape(query string, ses query.HttpSession) ([]byte, error) {
c := make(chan map[string]interface{}, 5)
go ses.GetQuery(query, c)
var data map[string]interface{}
@ -68,10 +68,10 @@ func GetQueryShape(query string, ses graph.HttpSession) ([]byte, error) {
// TODO(barakmich): Turn this into proper middleware.
func (api *Api) ServeV1Query(w http.ResponseWriter, r *http.Request, params httprouter.Params) int {
var ses graph.HttpSession
var ses query.HttpSession
switch params.ByName("query_lang") {
case "gremlin":
ses = gremlin.NewSession(api.ts, api.config.GremlinTimeout, false)
ses = gremlin.NewSession(api.ts, api.config.Timeout, false)
case "mql":
ses = mql.NewSession(api.ts)
default:
@ -84,7 +84,7 @@ func (api *Api) ServeV1Query(w http.ResponseWriter, r *http.Request, params http
code := string(bodyBytes)
result, err := ses.InputParses(code)
switch result {
case graph.Parsed:
case query.Parsed:
var output interface{}
var bytes []byte
var err error
@ -103,7 +103,7 @@ func (api *Api) ServeV1Query(w http.ResponseWriter, r *http.Request, params http
fmt.Fprint(w, string(bytes))
ses = nil
return 200
case graph.ParseFail:
case query.ParseFail:
ses = nil
return FormatJson400(w, err)
default:
@ -116,10 +116,10 @@ func (api *Api) ServeV1Query(w http.ResponseWriter, r *http.Request, params http
}
func (api *Api) ServeV1Shape(w http.ResponseWriter, r *http.Request, params httprouter.Params) int {
var ses graph.HttpSession
var ses query.HttpSession
switch params.ByName("query_lang") {
case "gremlin":
ses = gremlin.NewSession(api.ts, api.config.GremlinTimeout, false)
ses = gremlin.NewSession(api.ts, api.config.Timeout, false)
case "mql":
ses = mql.NewSession(api.ts)
default:
@ -132,7 +132,7 @@ func (api *Api) ServeV1Shape(w http.ResponseWriter, r *http.Request, params http
code := string(bodyBytes)
result, err := ses.InputParses(code)
switch result {
case graph.Parsed:
case query.Parsed:
var output []byte
var err error
output, err = GetQueryShape(code, ses)
@ -141,7 +141,7 @@ func (api *Api) ServeV1Shape(w http.ResponseWriter, r *http.Request, params http
}
fmt.Fprint(w, string(output))
return 200
case graph.ParseFail:
case query.ParseFail:
return FormatJson400(w, err)
default:
return FormatJsonError(w, 500, "Incomplete data?")

View file

@ -25,12 +25,12 @@ import (
"github.com/barakmich/glog"
"github.com/julienschmidt/httprouter"
"github.com/google/cayley/graph"
"github.com/google/cayley/nquads"
"github.com/google/cayley/quad"
"github.com/google/cayley/quad/nquads"
)
func ParseJsonToTripleList(jsonBody []byte) ([]*graph.Triple, error) {
var tripleList []*graph.Triple
func ParseJsonToTripleList(jsonBody []byte) ([]*quad.Quad, error) {
var tripleList []*quad.Quad
err := json.Unmarshal(jsonBody, &tripleList)
if err != nil {
return nil, err
@ -83,7 +83,7 @@ func (api *Api) ServeV1WriteNQuad(w http.ResponseWriter, r *http.Request, params
var (
n int
block = make([]*graph.Triple, 0, blockSize)
block = make([]*quad.Quad, 0, blockSize)
)
for {
t, err := dec.Unmarshal()

View file

@ -1,138 +0,0 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package nquads
import (
"reflect"
"testing"
"github.com/google/cayley/graph"
)
var testNTriples = []struct {
message string
input string
expect *graph.Triple
err error
}{
// NTriple tests.
{
message: "not parse invalid triples",
input: "invalid",
expect: nil,
err: ErrAbsentPredicate,
},
{
message: "invalid internal quote",
input: `":103032" "/film/performance/character" "Walter "Teacher" Cole" .`,
expect: nil,
err: ErrUnterminated,
},
{
message: "not parse comments",
input: "# nominally valid triple .",
expect: nil,
err: nil,
},
{
message: "parse simple triples",
input: "this is valid .",
expect: &graph.Triple{"this", "is", "valid", ""},
},
{
message: "parse quoted triples",
input: `this is "valid too" .`,
expect: &graph.Triple{"this", "is", "valid too", ""},
},
{
message: "parse escaped quoted triples",
input: `he said "\"That's all folks\"" .`,
expect: &graph.Triple{"he", "said", `"That's all folks"`, ""},
},
{
message: "parse an example real triple",
input: `":/guid/9202a8c04000641f80000000010c843c" "name" "George Morris" .`,
expect: &graph.Triple{":/guid/9202a8c04000641f80000000010c843c", "name", "George Morris", ""},
},
{
message: "parse a pathologically spaced triple",
input: "foo is \"\\tA big tough\\r\\nDeal\\\\\" .",
expect: &graph.Triple{"foo", "is", "\tA big tough\r\nDeal\\", ""},
},
// NQuad tests.
{
message: "parse a simple quad",
input: "this is valid quad .",
expect: &graph.Triple{"this", "is", "valid", "quad"},
},
{
message: "parse a quoted quad",
input: `this is valid "quad thing" .`,
expect: &graph.Triple{"this", "is", "valid", "quad thing"},
},
{
message: "parse crazy escaped quads",
input: `"\"this" "\"is" "\"valid" "\"quad thing".`,
expect: &graph.Triple{`"this`, `"is`, `"valid`, `"quad thing`},
},
// NTriple official tests.
{
message: "handle simple case with comments",
input: "<http://example/s> <http://example/p> <http://example/o> . # comment",
expect: &graph.Triple{"http://example/s", "http://example/p", "http://example/o", ""},
},
{
message: "handle simple case with comments",
input: "<http://example/s> <http://example/p> _:o . # comment",
expect: &graph.Triple{"http://example/s", "http://example/p", "_:o", ""},
},
{
message: "handle simple case with comments",
input: "<http://example/s> <http://example/p> \"o\" . # comment",
expect: &graph.Triple{"http://example/s", "http://example/p", "o", ""},
},
{
message: "handle simple case with comments",
input: "<http://example/s> <http://example/p> \"o\"^^<http://example/dt> . # comment",
expect: &graph.Triple{"http://example/s", "http://example/p", "o", ""},
},
{
message: "handle simple case with comments",
input: "<http://example/s> <http://example/p> \"o\"@en . # comment",
expect: &graph.Triple{"http://example/s", "http://example/p", "o", ""},
},
}
func TestParse(t *testing.T) {
for _, test := range testNTriples {
got, err := Parse(test.input)
if err != test.err {
t.Errorf("Unexpected error when %s: got:%v expect:%v", test.message, err, test.err)
}
if !reflect.DeepEqual(got, test.expect) {
t.Errorf("Failed to %s, %q, got:%q expect:%q", test.message, test.input, got, test.expect)
}
}
}
var result *graph.Triple
func BenchmarkParser(b *testing.B) {
for n := 0; n < b.N; n++ {
result, _ = Parse("<http://example/s> <http://example/p> \"object of some real\\tlength\"@en . # comment")
}
}

95
quad/cquads/actions.rl Normal file
View file

@ -0,0 +1,95 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
%%{
machine quads;
action Escape {
isEscaped = true
}
action Quote {
isQuoted = true
}
action StartSubject {
subject = p
}
action StartPredicate {
predicate = p
}
action StartObject {
object = p
}
action StartLabel {
label = p
}
action SetSubject {
if subject < 0 {
panic("unexpected parser state: subject start not set")
}
q.Subject = unEscape(data[subject:p], isQuoted, isEscaped)
isEscaped = false
isQuoted = false
}
action SetPredicate {
if predicate < 0 {
panic("unexpected parser state: predicate start not set")
}
q.Predicate = unEscape(data[predicate:p], isQuoted, isEscaped)
isEscaped = false
isQuoted = false
}
action SetObject {
if object < 0 {
panic("unexpected parser state: object start not set")
}
q.Object = unEscape(data[object:p], isQuoted, isEscaped)
isEscaped = false
isQuoted = false
}
action SetLabel {
if label < 0 {
panic("unexpected parser state: label start not set")
}
q.Label = unEscape(data[label:p], isQuoted, isEscaped)
isEscaped = false
isQuoted = false
}
action Return {
return q, nil
}
action Comment {
}
action Error {
if p < len(data) {
if r := data[p]; r < unicode.MaxASCII {
return q, fmt.Errorf("%v: unexpected rune %q at %d", quad.ErrInvalid, data[p], p)
} else {
return q, fmt.Errorf("%v: unexpected rune %q (\\u%04x) at %d", quad.ErrInvalid, data[p], data[p], p)
}
}
return q, quad.ErrIncomplete
}
}%%

144
quad/cquads/cquads.go Normal file
View file

@ -0,0 +1,144 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// Package cquads implements parsing of an N-Quads-like line-based syntax
// for RDF datasets.
//
// Parsing is based on a simplified grammar derived from
// the N-Quads grammar defined by http://www.w3.org/TR/n-quads/.
//
// For a complete definition of the grammar, see cquads.rl.
package cquads
import (
"bufio"
"bytes"
"fmt"
"io"
"strconv"
"github.com/google/cayley/quad"
)
// Parse returns a valid quad.Quad or a non-nil error. Parse does not
// handle comments, except where the comment placement does not prevent
// a complete valid quad.Quad from being defined.
func Parse(str string) (*quad.Quad, error) {
q, err := parse([]rune(str))
return &q, err
}
// Decoder implements simplified N-Quad document parsing.
type Decoder struct {
r *bufio.Reader
line []byte
}
// NewDecoder returns an N-Quad decoder that takes its input from the
// provided io.Reader.
func NewDecoder(r io.Reader) *Decoder {
return &Decoder{r: bufio.NewReader(r)}
}
// Unmarshal returns the next valid N-Quad as a quad.Quad, or an error.
func (dec *Decoder) Unmarshal() (*quad.Quad, error) {
dec.line = dec.line[:0]
var line []byte
for {
for {
l, pre, err := dec.r.ReadLine()
if err != nil {
return nil, err
}
dec.line = append(dec.line, l...)
if !pre {
break
}
}
if line = bytes.TrimSpace(dec.line); len(line) != 0 && line[0] != '#' {
break
}
dec.line = dec.line[:0]
}
triple, err := Parse(string(line))
if err != nil {
return nil, fmt.Errorf("failed to parse %q: %v", dec.line, err)
}
if triple == nil {
return dec.Unmarshal()
}
return triple, nil
}
func unEscape(r []rune, isQuoted, isEscaped bool) string {
if isQuoted {
r = r[1 : len(r)-1]
}
if len(r) >= 2 && r[0] == '<' && r[len(r)-1] == '>' {
return string(r[1 : len(r)-1])
}
if !isEscaped {
return string(r)
}
buf := bytes.NewBuffer(make([]byte, 0, len(r)))
for i := 0; i < len(r); {
switch r[i] {
case '\\':
i++
var c byte
switch r[i] {
case 't':
c = '\t'
case 'b':
c = '\b'
case 'n':
c = '\n'
case 'r':
c = '\r'
case 'f':
c = '\f'
case '"':
c = '"'
case '\'':
c = '\''
case '\\':
c = '\\'
case 'u':
rc, err := strconv.ParseInt(string(r[i+1:i+5]), 16, 32)
if err != nil {
panic(fmt.Errorf("internal parser error: %v", err))
}
buf.WriteRune(rune(rc))
i += 5
continue
case 'U':
rc, err := strconv.ParseInt(string(r[i+1:i+9]), 16, 32)
if err != nil {
panic(fmt.Errorf("internal parser error: %v", err))
}
buf.WriteRune(rune(rc))
i += 9
continue
}
buf.WriteByte(c)
default:
buf.WriteRune(r[i])
}
i++
}
return buf.String()
}
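The line handling in Decoder.Unmarshal can be sketched in isolation: fragments reported by ReadLine are joined, and blank or comment-only lines are skipped before anything reaches the parser. The names `nextStatement` and `statements` are hypothetical helpers introduced for this sketch; they are not part of the cquads package.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// nextStatement returns the next non-blank, non-comment line from r.
// It joins the fragments that ReadLine yields for over-long lines.
func nextStatement(r *bufio.Reader) (string, error) {
	var buf []byte
	for {
		for {
			l, isPrefix, err := r.ReadLine()
			if err != nil {
				return "", err
			}
			buf = append(buf, l...)
			if !isPrefix {
				break
			}
		}
		if line := bytes.TrimSpace(buf); len(line) != 0 && line[0] != '#' {
			return string(line), nil
		}
		// Blank line or full-line comment: discard and read on.
		buf = buf[:0]
	}
}

// statements collects every statement line in input.
func statements(input string) ([]string, error) {
	r := bufio.NewReader(strings.NewReader(input))
	var out []string
	for {
		line, err := nextStatement(r)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, line)
	}
}

func main() {
	lines, err := statements("# a comment\n\n<s> <p> <o> .\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Skipping full-line comments at this level is why the grammar only needs to cope with comments that trail a complete statement.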

106
quad/cquads/cquads.rl Normal file
View file

@ -0,0 +1,106 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// Ragel grammar definition derived from http://www.w3.org/TR/n-quads/#sec-grammar.
%%{
machine quads;
alphtype rune;
PN_CHARS_BASE = [A-Za-z]
| 0x00c0 .. 0x00d6
| 0x00d8 .. 0x00f6
| 0x00f8 .. 0x02ff
| 0x0370 .. 0x037d
| 0x037f .. 0x1fff
| 0x200c .. 0x200d
| 0x2070 .. 0x218f
| 0x2c00 .. 0x2fef
| 0x3001 .. 0xd7ff
| 0xf900 .. 0xfdcf
| 0xfdf0 .. 0xfffd
| 0x10000 .. 0xeffff
;
PN_CHARS_U = PN_CHARS_BASE | '_' | ':' ;
PN_CHARS = PN_CHARS_U
| '-'
| [0-9]
| 0xb7
| 0x0300 .. 0x036f
| 0x203f .. 0x2040
;
ECHAR = ('\\' [tbnrf"'\\]) %Escape ;
UCHAR = ('\\u' xdigit {4}
| '\\U' xdigit {8}) %Escape
;
BLANK_NODE_LABEL = '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)? ;
STRING_LITERAL = (
'!'
| '#' .. '['
| ']' .. 0x7e
| 0x80 .. 0x10ffff
| ECHAR
| UCHAR)+ - ('_:' | any* '.' | '#' any*)
;
STRING_LITERAL_QUOTE = '"' (
0x00 .. 0x09
| 0x0b .. 0x0c
| 0x0e .. '!'
| '#' .. '['
| ']' .. 0x10ffff
| ECHAR
| UCHAR)*
'"'
;
IRIREF = '<' (
'!' .. ';'
| '='
| '?' .. '['
| ']'
| '_'
| 'a' .. 'z'
| '~'
| 0x80 .. 0x10ffff
| UCHAR)*
'>'
;
LANGTAG = '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ;
whitespace = [ \t] ;
literal = STRING_LITERAL | STRING_LITERAL_QUOTE % Quote | STRING_LITERAL_QUOTE ('^^' IRIREF | LANGTAG) ;
subject = (literal | BLANK_NODE_LABEL) ;
predicate = literal ;
object = (literal | BLANK_NODE_LABEL) ;
graphLabel = (literal | BLANK_NODE_LABEL) ;
statement := (
whitespace* subject >StartSubject %SetSubject
whitespace+ predicate >StartPredicate %SetPredicate
whitespace+ object >StartObject %SetObject
(whitespace+ graphLabel >StartLabel %SetLabel)?
whitespace* '.' whitespace* ('#' any*)? >Comment
) %Return @!Error ;
}%%

782
quad/cquads/cquads_test.go Normal file

File diff suppressed because it is too large Load diff

6692
quad/cquads/parse.go Normal file

File diff suppressed because it is too large Load diff

58
quad/cquads/parse.rl Normal file
View file

@ -0,0 +1,58 @@
// GO SOURCE FILE MACHINE GENERATED BY RAGEL; DO NOT EDIT
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package cquads
import (
"fmt"
"unicode"
"github.com/google/cayley/quad"
)
%%{
machine quads;
include "actions.rl";
include "cquads.rl";
write data;
}%%
func parse(data []rune) (quad.Quad, error) {
var (
cs, p int
pe = len(data)
eof = pe
subject = -1
predicate = -1
object = -1
label = -1
isEscaped bool
isQuoted bool
q quad.Quad
)
%%write init;
%%write exec;
return quad.Quad{}, quad.ErrInvalid
}

BIN
quad/nquad_tests.tar.gz Normal file

Binary file not shown.

87
quad/nquads/actions.rl Normal file
View file

@ -0,0 +1,87 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
%%{
machine quads;
action Escape {
isEscaped = true
}
action StartSubject {
subject = p
}
action StartPredicate {
predicate = p
}
action StartObject {
object = p
}
action StartLabel {
label = p
}
action SetSubject {
if subject < 0 {
panic("unexpected parser state: subject start not set")
}
q.Subject = unEscape(data[subject:p], isEscaped)
isEscaped = false
}
action SetPredicate {
if predicate < 0 {
panic("unexpected parser state: predicate start not set")
}
q.Predicate = unEscape(data[predicate:p], isEscaped)
isEscaped = false
}
action SetObject {
if object < 0 {
panic("unexpected parser state: object start not set")
}
q.Object = unEscape(data[object:p], isEscaped)
isEscaped = false
}
action SetLabel {
if label < 0 {
panic("unexpected parser state: label start not set")
}
q.Label = unEscape(data[label:p], isEscaped)
isEscaped = false
}
action Return {
return q, nil
}
action Comment {
}
action Error {
if p < len(data) {
if r := data[p]; r < unicode.MaxASCII {
return q, fmt.Errorf("%v: unexpected rune %q at %d", quad.ErrInvalid, data[p], p)
} else {
return q, fmt.Errorf("%v: unexpected rune %q (\\u%04x) at %d", quad.ErrInvalid, data[p], data[p], p)
}
}
return q, quad.ErrIncomplete
}
}%%

138
quad/nquads/nquads.go Normal file
View file

@ -0,0 +1,138 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// Package nquads implements parsing the RDF 1.1 N-Quads line-based syntax
// for RDF datasets.
//
// N-Quad parsing is performed as defined by http://www.w3.org/TR/n-quads/
// with the exception that the nquads package will allow relative IRI values,
// which are prohibited by the N-Quads specification.
package nquads
import (
"bufio"
"bytes"
"fmt"
"io"
"strconv"
"github.com/google/cayley/quad"
)
// Parse returns a valid quad.Quad or a non-nil error. Parse does not
// handle comments, except where the comment placement does not prevent
// a complete valid quad.Quad from being defined.
func Parse(str string) (*quad.Quad, error) {
q, err := parse([]rune(str))
return &q, err
}
// Decoder implements N-Quad document parsing according to the RDF
// 1.1 N-Quads specification.
type Decoder struct {
r *bufio.Reader
line []byte
}
// NewDecoder returns an N-Quad decoder that takes its input from the
// provided io.Reader.
func NewDecoder(r io.Reader) *Decoder {
return &Decoder{r: bufio.NewReader(r)}
}
// Unmarshal returns the next valid N-Quad as a quad.Quad, or an error.
func (dec *Decoder) Unmarshal() (*quad.Quad, error) {
dec.line = dec.line[:0]
var line []byte
for {
for {
l, pre, err := dec.r.ReadLine()
if err != nil {
return nil, err
}
dec.line = append(dec.line, l...)
if !pre {
break
}
}
if line = bytes.TrimSpace(dec.line); len(line) != 0 && line[0] != '#' {
break
}
dec.line = dec.line[:0]
}
triple, err := Parse(string(line))
if err != nil {
return nil, fmt.Errorf("failed to parse %q: %v", dec.line, err)
}
if triple == nil {
return dec.Unmarshal()
}
return triple, nil
}
func unEscape(r []rune, isEscaped bool) string {
if !isEscaped {
return string(r)
}
buf := bytes.NewBuffer(make([]byte, 0, len(r)))
for i := 0; i < len(r); {
switch r[i] {
case '\\':
i++
var c byte
switch r[i] {
case 't':
c = '\t'
case 'b':
c = '\b'
case 'n':
c = '\n'
case 'r':
c = '\r'
case 'f':
c = '\f'
case '"':
c = '"'
case '\'':
c = '\''
case '\\':
c = '\\'
case 'u':
rc, err := strconv.ParseInt(string(r[i+1:i+5]), 16, 32)
if err != nil {
panic(fmt.Errorf("internal parser error: %v", err))
}
buf.WriteRune(rune(rc))
i += 5
continue
case 'U':
rc, err := strconv.ParseInt(string(r[i+1:i+9]), 16, 32)
if err != nil {
panic(fmt.Errorf("internal parser error: %v", err))
}
buf.WriteRune(rune(rc))
i += 9
continue
}
buf.WriteByte(c)
default:
buf.WriteRune(r[i])
}
i++
}
return buf.String()
}
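The escape handling in unEscape can be sketched as a standalone function. This is a variant written for illustration, not the package's code: the hypothetical `unescape` below returns an error for malformed input instead of relying on the Ragel grammar to have validated it, and it takes a string rather than a rune slice plus flags.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unescape decodes the ECHAR and UCHAR escapes of the N-Quads grammar.
// Unlike the package code it validates its input and reports errors.
func unescape(s string) (string, error) {
	r := []rune(s)
	var b strings.Builder
	for i := 0; i < len(r); i++ {
		if r[i] != '\\' {
			b.WriteRune(r[i])
			continue
		}
		i++
		if i >= len(r) {
			return "", fmt.Errorf("trailing backslash")
		}
		switch r[i] {
		case 't':
			b.WriteByte('\t')
		case 'b':
			b.WriteByte('\b')
		case 'n':
			b.WriteByte('\n')
		case 'r':
			b.WriteByte('\r')
		case 'f':
			b.WriteByte('\f')
		case '"', '\'', '\\':
			b.WriteRune(r[i])
		case 'u', 'U':
			// \u takes 4 hex digits, \U takes 8.
			n := 4
			if r[i] == 'U' {
				n = 8
			}
			if i+1+n > len(r) {
				return "", fmt.Errorf("short unicode escape at %d", i)
			}
			v, err := strconv.ParseUint(string(r[i+1:i+1+n]), 16, 32)
			if err != nil {
				return "", err
			}
			b.WriteRune(rune(v))
			i += n
		default:
			return "", fmt.Errorf("unknown escape \\%c", r[i])
		}
	}
	return b.String(), nil
}

func main() {
	out, err := unescape(`caf\u00e9\tau lait`)
	fmt.Printf("%q %v\n", out, err)
}
```

Returning an error rather than panicking is a design choice for code that may see unvalidated input; inside the generated parser the grammar already guarantees well-formed escapes.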

97
quad/nquads/nquads.rl Normal file
View file

@ -0,0 +1,97 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// Ragel grammar definition derived from http://www.w3.org/TR/n-quads/#sec-grammar.
%%{
machine quads;
alphtype rune;
PN_CHARS_BASE = [A-Za-z]
| 0x00c0 .. 0x00d6
| 0x00d8 .. 0x00f6
| 0x00f8 .. 0x02ff
| 0x0370 .. 0x037d
| 0x037f .. 0x1fff
| 0x200c .. 0x200d
| 0x2070 .. 0x218f
| 0x2c00 .. 0x2fef
| 0x3001 .. 0xd7ff
| 0xf900 .. 0xfdcf
| 0xfdf0 .. 0xfffd
| 0x10000 .. 0xeffff
;
PN_CHARS_U = PN_CHARS_BASE | '_' | ':' ;
PN_CHARS = PN_CHARS_U
| '-'
| [0-9]
| 0xb7
| 0x0300 .. 0x036f
| 0x203f .. 0x2040
;
ECHAR = ('\\' [tbnrf"'\\]) %Escape ;
UCHAR = ('\\u' xdigit {4}
| '\\U' xdigit {8}) %Escape
;
BLANK_NODE_LABEL = '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)? ;
STRING_LITERAL_QUOTE = '"' (
0x00 .. 0x09
| 0x0b .. 0x0c
| 0x0e .. '!'
| '#' .. '['
| ']' .. 0x10ffff
| ECHAR
| UCHAR)*
'"'
;
IRIREF = '<' (
'!' .. ';'
| '='
| '?' .. '['
| ']'
| '_'
| 'a' .. 'z'
| '~'
| 0x80 .. 0x10ffff
| UCHAR)*
'>'
;
LANGTAG = '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ;
whitespace = [ \t] ;
literal = STRING_LITERAL_QUOTE ('^^' IRIREF | LANGTAG)? ;
subject = IRIREF | BLANK_NODE_LABEL ;
predicate = IRIREF ;
object = IRIREF | BLANK_NODE_LABEL | literal ;
graphLabel = IRIREF | BLANK_NODE_LABEL ;
statement := (
whitespace* subject >StartSubject %SetSubject
whitespace* predicate >StartPredicate %SetPredicate
whitespace* object >StartObject %SetObject
(whitespace* graphLabel >StartLabel %SetLabel)?
whitespace* '.' whitespace* ('#' any*)? >Comment
) %Return @!Error ;
}%%

589
quad/nquads/nquads_test.go Normal file

File diff suppressed because it is too large Load diff

3652
quad/nquads/parse.go Normal file

File diff suppressed because it is too large Load diff

57
quad/nquads/parse.rl Normal file
View file

@ -0,0 +1,57 @@
// GO SOURCE FILE MACHINE GENERATED BY RAGEL; DO NOT EDIT
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package nquads
import (
"fmt"
"unicode"
"github.com/google/cayley/quad"
)
%%{
machine quads;
include "actions.rl";
include "nquads.rl";
write data;
}%%
func parse(data []rune) (quad.Quad, error) {
var (
cs, p int
pe = len(data)
eof = pe
subject = -1
predicate = -1
object = -1
label = -1
isEscaped bool
q quad.Quad
)
%%write init;
%%write exec;
return quad.Quad{}, quad.ErrInvalid
}

View file

@ -1,29 +1,52 @@
// Copyright 2014 The Cayley Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// +build ignore
package nquads
package main
import (
"bufio"
"errors"
"fmt"
"io"
"log"
"os"
"strings"
"github.com/google/cayley/graph"
)
func main() {
dec := NewDecoder(os.Stdin)
for {
t, err := dec.Unmarshal()
if err != nil {
if err == io.EOF {
return
}
log.Println(err)
// t is nil when Unmarshal fails, so skip to the next line.
continue
}
if t.Subject[0] == ':' && t.Subject[1] == '/' {
t.Subject = "<" + t.Subject[1:] + ">"
} else {
t.Subject = "_" + t.Subject
}
if t.Object[0] == ':' {
if t.Object[1] == '/' {
t.Object = "<" + t.Object[1:] + ">"
} else {
t.Object = "_" + t.Object
}
} else if t.Object[0] == '/' {
t.Object = "<" + t.Object + ">"
} else {
t.Object = fmt.Sprintf(`%q`, t.Object)
}
fmt.Printf("%s <%s> %s .\n", t.Subject, t.Predicate, t.Object)
}
}
// Historical N-Quads parser code.
// -------------------------------
var (
ErrAbsentSubject = errors.New("nquads: absent subject")
ErrAbsentPredicate = errors.New("nquads: absent predicate")

BIN
quad/ntriple_tests.tar.gz Normal file

Binary file not shown.

View file

@ -12,7 +12,8 @@
// See the License for the specific language governing permissions and
// limitations under the License.
package graph
// Package quad defines quad and triple handling.
package quad
// Defines the struct which makes the TripleStore possible -- the triple.
//
@ -25,7 +26,7 @@ package graph
// list of triples. The rest is just indexing for speed.
//
// Adding fields to the triple is not to be taken lightly. You'll see I mention
// provenance, but don't as yet use it in any backing store. In general, there
// label, but don't as yet use it in any backing store. In general, there
// can be features that can be turned on or off for any store, but I haven't
// decided how to allow/disallow them yet. Another such example would be to add
// a forward and reverse index field -- forward being "order the list of
@ -35,17 +36,22 @@ package graph
// There will never be that much in this file except for the definition, but
// the consequences are not to be taken lightly. But do suggest cool features!
import "fmt"
import (
"errors"
"fmt"
)
// TODO(kortschak) Consider providing MarshalJSON and UnmarshalJSON
// instead of using struct tags.
var (
ErrInvalid = errors.New("invalid N-Quad")
ErrIncomplete = errors.New("incomplete N-Quad")
)
// Our triple struct, used throughout.
type Triple struct {
Subject string `json:"subject"`
Predicate string `json:"predicate"`
Object string `json:"object"`
Provenance string `json:"provenance,omitempty"`
type Quad struct {
Subject string `json:"subject"`
Predicate string `json:"predicate"`
Object string `json:"object"`
Label string `json:"label,omitempty"`
}
// Direction specifies an edge's type.
@ -57,7 +63,7 @@ const (
Subject
Predicate
Object
Provenance
Label
)
func (d Direction) Prefix() byte {
@ -68,7 +74,7 @@ func (d Direction) Prefix() byte {
return 's'
case Predicate:
return 'p'
case Provenance:
case Label:
return 'c'
case Object:
return 'o'
@ -85,8 +91,8 @@ func (d Direction) String() string {
return "subject"
case Predicate:
return "predicate"
case Provenance:
return "provenance"
case Label:
return "label"
case Object:
return "object"
default:
@ -98,45 +104,48 @@ func (d Direction) String() string {
// instead of the pointer. This needs benchmarking to make the decision.
// Per-field accessor for triples
func (t *Triple) Get(d Direction) string {
func (q *Quad) Get(d Direction) string {
switch d {
case Subject:
return t.Subject
return q.Subject
case Predicate:
return t.Predicate
case Provenance:
return t.Provenance
return q.Predicate
case Label:
return q.Label
case Object:
return t.Object
return q.Object
default:
panic(d.String())
}
}
func (t *Triple) Equals(o *Triple) bool {
return *t == *o
func (q *Quad) Equals(o *Quad) bool {
return *q == *o
}
// Pretty-prints a triple.
func (t *Triple) String() string {
// TODO(kortschak) String methods should generally not terminate in '\n'.
return fmt.Sprintf("%s -- %s -> %s\n", t.Subject, t.Predicate, t.Object)
func (q *Quad) String() string {
return fmt.Sprintf("%s -- %s -> %s", q.Subject, q.Predicate, q.Object)
}
func (t *Triple) IsValid() bool {
return t.Subject != "" && t.Predicate != "" && t.Object != ""
func (q *Quad) IsValid() bool {
return q.Subject != "" && q.Predicate != "" && q.Object != ""
}
// TODO(kortschak) NTriple looks like a good candidate for conversion
// to MarshalText() (text []byte, err error) and then move parsing code
// from nquads to here to provide UnmarshalText(text []byte) error.
// Prints a triple in N-Triple format.
func (t *Triple) NTriple() string {
if t.Provenance == "" {
// Prints a triple in N-Quad format.
func (q *Quad) NTriple() string {
if q.Label == "" {
//TODO(barakmich): Proper escaping.
return fmt.Sprintf("%s %s %s .", t.Subject, t.Predicate, t.Object)
return fmt.Sprintf("%s %s %s .", q.Subject, q.Predicate, q.Object)
} else {
return fmt.Sprintf("%s %s %s %s .", t.Subject, t.Predicate, t.Object, t.Provenance)
return fmt.Sprintf("%s %s %s %s .", q.Subject, q.Predicate, q.Object, q.Label)
}
}
type Unmarshaler interface {
Unmarshal() (*Quad, error)
}
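One practical consequence of renaming the JSON field from "provenance" to "label" can be shown with a short sketch. The `Quad` type below is a local copy of the struct from this diff, and `encode` and `decode` are hypothetical helpers. The behaviour relied on is standard encoding/json: `omitempty` drops an empty Label on output, and unknown keys are silently ignored on input, so a payload still using "provenance" decodes with an empty Label.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Quad is a local copy of the struct defined in the quad package.
type Quad struct {
	Subject   string `json:"subject"`
	Predicate string `json:"predicate"`
	Object    string `json:"object"`
	Label     string `json:"label,omitempty"`
}

// encode marshals q to a JSON string, returning "" on error.
func encode(q Quad) string {
	b, err := json.Marshal(q)
	if err != nil {
		return ""
	}
	return string(b)
}

// decode unmarshals a single JSON object into a Quad.
func decode(s string) (Quad, error) {
	var q Quad
	err := json.Unmarshal([]byte(s), &q)
	return q, err
}

func main() {
	// An empty Label is omitted from the output.
	fmt.Println(encode(Quad{"foo", "bar", "baz", ""}))

	// The old key is ignored rather than rejected.
	old, _ := decode(`{"subject":"foo","predicate":"bar","object":"baz","provenance":"graph"}`)
	fmt.Printf("%q\n", old.Label)

	// The new key populates Label.
	cur, _ := decode(`{"subject":"foo","predicate":"bar","object":"baz","label":"graph"}`)
	fmt.Printf("%q\n", cur.Label)
}
```

Clients that still send "provenance" would therefore lose the fourth field without any error, which may be worth flagging to API users.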

View file

@ -22,6 +22,7 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
func getStrings(obj *otto.Object, field string) []string {
@ -135,17 +136,17 @@ func buildInOutIterator(obj *otto.Object, ts graph.TripleStore, base graph.Itera
tags = makeListOfStringsFromArrayValue(one.Object())
}
for _, tag := range tags {
predicateNodeIterator.AddTag(tag)
predicateNodeIterator.Tagger().Add(tag)
}
}
in, out := graph.Subject, graph.Object
in, out := quad.Subject, quad.Object
if isReverse {
in, out = out, in
}
lto := iterator.NewLinksTo(ts, base, in)
and := iterator.NewAnd()
and.AddSubIterator(iterator.NewLinksTo(ts, predicateNodeIterator, graph.Predicate))
and.AddSubIterator(iterator.NewLinksTo(ts, predicateNodeIterator, quad.Predicate))
and.AddSubIterator(lto)
return iterator.NewHasA(ts, and, out)
}
@ -179,7 +180,7 @@ func buildIteratorTreeHelper(obj *otto.Object, ts graph.TripleStore, base graph.
case "tag":
it = subIt
for _, tag := range stringArgs {
it.AddTag(tag)
it.Tagger().Add(tag)
}
case "save":
all := ts.NodesAllIterator()
@ -187,16 +188,16 @@ func buildIteratorTreeHelper(obj *otto.Object, ts graph.TripleStore, base graph.
return iterator.NewNull()
}
if len(stringArgs) == 2 {
all.AddTag(stringArgs[1])
all.Tagger().Add(stringArgs[1])
} else {
all.AddTag(stringArgs[0])
all.Tagger().Add(stringArgs[0])
}
predFixed := ts.FixedIterator()
predFixed.Add(ts.ValueOf(stringArgs[0]))
subAnd := iterator.NewAnd()
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, graph.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, all, graph.Object))
hasa := iterator.NewHasA(ts, subAnd, graph.Subject)
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, quad.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, all, quad.Object))
hasa := iterator.NewHasA(ts, subAnd, quad.Subject)
and := iterator.NewAnd()
and.AddSubIterator(hasa)
and.AddSubIterator(subIt)
@ -207,16 +208,16 @@ func buildIteratorTreeHelper(obj *otto.Object, ts graph.TripleStore, base graph.
return iterator.NewNull()
}
if len(stringArgs) == 2 {
all.AddTag(stringArgs[1])
all.Tagger().Add(stringArgs[1])
} else {
all.AddTag(stringArgs[0])
all.Tagger().Add(stringArgs[0])
}
predFixed := ts.FixedIterator()
predFixed.Add(ts.ValueOf(stringArgs[0]))
subAnd := iterator.NewAnd()
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, graph.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, all, graph.Subject))
hasa := iterator.NewHasA(ts, subAnd, graph.Object)
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, quad.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, all, quad.Subject))
hasa := iterator.NewHasA(ts, subAnd, quad.Object)
and := iterator.NewAnd()
and.AddSubIterator(hasa)
and.AddSubIterator(subIt)
@ -232,9 +233,9 @@ func buildIteratorTreeHelper(obj *otto.Object, ts graph.TripleStore, base graph.
predFixed := ts.FixedIterator()
predFixed.Add(ts.ValueOf(stringArgs[0]))
subAnd := iterator.NewAnd()
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, graph.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed, graph.Object))
hasa := iterator.NewHasA(ts, subAnd, graph.Subject)
subAnd.AddSubIterator(iterator.NewLinksTo(ts, predFixed, quad.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(ts, fixed, quad.Object))
hasa := iterator.NewHasA(ts, subAnd, quad.Subject)
and := iterator.NewAnd()
and.AddSubIterator(hasa)
and.AddSubIterator(subIt)

View file

@ -84,7 +84,7 @@ func setupGremlin(env *otto.Otto, ses *Session) {
graph.Set("Emit", func(call otto.FunctionCall) otto.Value {
value := call.Argument(0)
if value.IsDefined() {
ses.SendResult(&GremlinResult{metaresult: false, err: "", val: &value, actualResults: nil})
ses.SendResult(&Result{val: &value})
}
return otto.NullValue()
})

View file

@ -38,7 +38,7 @@ func embedFinals(env *otto.Otto, ses *Session, obj *otto.Object) {
func allFunc(env *otto.Otto, ses *Session, obj *otto.Object) func(otto.FunctionCall) otto.Value {
return func(call otto.FunctionCall) otto.Value {
it := buildIteratorTree(obj, ses.ts)
it.AddTag(TopResultTag)
it.Tagger().Add(TopResultTag)
ses.limit = -1
ses.count = 0
runIteratorOnSession(it, ses)
@ -51,7 +51,7 @@ func limitFunc(env *otto.Otto, ses *Session, obj *otto.Object) func(otto.Functio
if len(call.ArgumentList) > 0 {
limitVal, _ := call.Argument(0).ToInteger()
it := buildIteratorTree(obj, ses.ts)
it.AddTag(TopResultTag)
it.Tagger().Add(TopResultTag)
ses.limit = int(limitVal)
ses.count = 0
runIteratorOnSession(it, ses)
@ -63,7 +63,7 @@ func limitFunc(env *otto.Otto, ses *Session, obj *otto.Object) func(otto.Functio
func toArrayFunc(env *otto.Otto, ses *Session, obj *otto.Object, withTags bool) func(otto.FunctionCall) otto.Value {
return func(call otto.FunctionCall) otto.Value {
it := buildIteratorTree(obj, ses.ts)
it.AddTag(TopResultTag)
it.Tagger().Add(TopResultTag)
limit := -1
if len(call.ArgumentList) > 0 {
limitParsed, _ := call.Argument(0).ToInteger()
@ -90,7 +90,7 @@ func toArrayFunc(env *otto.Otto, ses *Session, obj *otto.Object, withTags bool)
func toValueFunc(env *otto.Otto, ses *Session, obj *otto.Object, withTags bool) func(otto.FunctionCall) otto.Value {
return func(call otto.FunctionCall) otto.Value {
it := buildIteratorTree(obj, ses.ts)
it.AddTag(TopResultTag)
it.Tagger().Add(TopResultTag)
limit := 1
var val otto.Value
var err error
@ -120,7 +120,7 @@ func toValueFunc(env *otto.Otto, ses *Session, obj *otto.Object, withTags bool)
func mapFunc(env *otto.Otto, ses *Session, obj *otto.Object) func(otto.FunctionCall) otto.Value {
return func(call otto.FunctionCall) otto.Value {
it := buildIteratorTree(obj, ses.ts)
it.AddTag(TopResultTag)
it.Tagger().Add(TopResultTag)
limit := -1
if len(call.ArgumentList) == 0 {
return otto.NullValue()
@ -148,10 +148,12 @@ func runIteratorToArray(it graph.Iterator, ses *Session, limit int) []map[string
count := 0
it, _ = it.Optimize()
for {
if ses.doHalt {
select {
case <-ses.kill:
return nil
default:
}
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
break
}
@ -163,8 +165,10 @@ func runIteratorToArray(it graph.Iterator, ses *Session, limit int) []map[string
break
}
for it.NextResult() {
if ses.doHalt {
select {
case <-ses.kill:
return nil
default:
}
tags := make(map[string]graph.Value)
it.TagResults(tags)
@ -184,10 +188,12 @@ func runIteratorToArrayNoTags(it graph.Iterator, ses *Session, limit int) []stri
count := 0
it, _ = it.Optimize()
for {
if ses.doHalt {
select {
case <-ses.kill:
return nil
default:
}
val, ok := it.Next()
val, ok := graph.Next(it)
if !ok {
break
}
@ -205,10 +211,12 @@ func runIteratorWithCallback(it graph.Iterator, ses *Session, callback otto.Valu
count := 0
it, _ = it.Optimize()
for {
if ses.doHalt {
select {
case <-ses.kill:
return
default:
}
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
break
}
@ -221,8 +229,10 @@ func runIteratorWithCallback(it graph.Iterator, ses *Session, callback otto.Valu
break
}
for it.NextResult() {
if ses.doHalt {
select {
case <-ses.kill:
return
default:
}
tags := make(map[string]graph.Value)
it.TagResults(tags)
@ -238,35 +248,36 @@ func runIteratorWithCallback(it graph.Iterator, ses *Session, callback otto.Valu
}
func runIteratorOnSession(it graph.Iterator, ses *Session) {
if ses.lookingForQueryShape {
iterator.OutputQueryShapeForIterator(it, ses.ts, ses.queryShape)
if ses.wantShape {
iterator.OutputQueryShapeForIterator(it, ses.ts, ses.shape)
return
}
it, _ = it.Optimize()
glog.V(2).Infoln(it.DebugString(0))
for {
// TODO(barakmich): Better halting.
if ses.doHalt {
select {
case <-ses.kill:
return
default:
}
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
break
}
tags := make(map[string]graph.Value)
it.TagResults(tags)
cont := ses.SendResult(&GremlinResult{metaresult: false, err: "", val: nil, actualResults: &tags})
if !cont {
if !ses.SendResult(&Result{actualResults: &tags}) {
break
}
for it.NextResult() {
if ses.doHalt {
select {
case <-ses.kill:
return
default:
}
tags := make(map[string]graph.Value)
it.TagResults(tags)
cont := ses.SendResult(&GremlinResult{metaresult: false, err: "", val: nil, actualResults: &tags})
if !cont {
if !ses.SendResult(&Result{actualResults: &tags}) {
break
}
}

View file

@ -20,6 +20,8 @@ import (
"testing"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
_ "github.com/google/cayley/graph/memstore"
)
@ -36,7 +38,7 @@ import (
// \-->|#D#|------------->+---+
// +---+
//
var simpleGraph = []*graph.Triple{
var simpleGraph = []*quad.Quad{
{"A", "follows", "B", ""},
{"C", "follows", "B", ""},
{"C", "follows", "D", ""},
@ -50,7 +52,7 @@ var simpleGraph = []*graph.Triple{
{"G", "status", "cool", "status_graph"},
}
func makeTestSession(data []*graph.Triple) *Session {
func makeTestSession(data []*quad.Quad) *Session {
ts, _ := graph.NewTripleStore("memstore", "", nil)
for _, t := range data {
ts.AddTriple(t)
@ -244,14 +246,14 @@ var testQueries = []struct {
},
}
func runQueryGetTag(g []*graph.Triple, query string, tag string) []string {
func runQueryGetTag(g []*quad.Quad, query string, tag string) []string {
js := makeTestSession(g)
c := make(chan interface{}, 5)
js.ExecInput(query, c, -1)
var results []string
for res := range c {
data := res.(*GremlinResult)
data := res.(*Result)
if data.val == nil {
val := (*data.actualResults)[tag]
if val != nil {

View file

@ -18,52 +18,51 @@ import (
"errors"
"fmt"
"sort"
"sync"
"time"
"github.com/robertkrimen/otto"
"github.com/google/cayley/graph"
"github.com/google/cayley/query"
)
var ErrKillTimeout = errors.New("query timed out")
type Session struct {
ts graph.TripleStore
currentChannel chan interface{}
env *otto.Otto
debug bool
limit int
count int
dataOutput []interface{}
lookingForQueryShape bool
queryShape map[string]interface{}
err error
script *otto.Script
doHalt bool
timeoutSec time.Duration
emptyEnv *otto.Otto
ts graph.TripleStore
results chan interface{}
env *otto.Otto
envLock sync.Mutex
debug bool
limit int
count int
dataOutput []interface{}
wantShape bool
shape map[string]interface{}
err error
script *otto.Script
kill chan struct{}
timeout time.Duration
emptyEnv *otto.Otto
}
func NewSession(inputTripleStore graph.TripleStore, timeoutSec int, persist bool) *Session {
var g Session
g.ts = inputTripleStore
func NewSession(ts graph.TripleStore, timeout time.Duration, persist bool) *Session {
g := Session{
ts: ts,
limit: -1,
timeout: timeout,
}
g.env = BuildEnviron(&g)
g.limit = -1
g.count = 0
g.lookingForQueryShape = false
if persist {
g.emptyEnv = g.env
}
if timeoutSec < 0 {
g.timeoutSec = time.Duration(-1)
} else {
g.timeoutSec = time.Duration(timeoutSec)
}
g.ClearJson()
return &g
}
type GremlinResult struct {
type Result struct {
metaresult bool
err string
err error
val *otto.Value
actualResults *map[string]graph.Value
}
@ -72,33 +71,35 @@ func (s *Session) ToggleDebug() {
s.debug = !s.debug
}
func (s *Session) GetQuery(input string, output_struct chan map[string]interface{}) {
defer close(output_struct)
s.queryShape = make(map[string]interface{})
s.lookingForQueryShape = true
func (s *Session) GetQuery(input string, out chan map[string]interface{}) {
defer close(out)
s.shape = make(map[string]interface{})
s.wantShape = true
s.env.Run(input)
output_struct <- s.queryShape
s.queryShape = nil
out <- s.shape
s.shape = nil
}
func (s *Session) InputParses(input string) (graph.ParseResult, error) {
func (s *Session) InputParses(input string) (query.ParseResult, error) {
script, err := s.env.Compile("", input)
if err != nil {
return graph.ParseFail, err
return query.ParseFail, err
}
s.script = script
return graph.Parsed, nil
return query.Parsed, nil
}
func (s *Session) SendResult(result *GremlinResult) bool {
func (s *Session) SendResult(r *Result) bool {
if s.limit >= 0 && s.limit == s.count {
return false
}
if s.doHalt {
select {
case <-s.kill:
return false
default:
}
if s.currentChannel != nil {
s.currentChannel <- result
if s.results != nil {
s.results <- r
s.count++
if s.limit >= 0 && s.limit == s.count {
return false
@ -109,42 +110,46 @@ func (s *Session) SendResult(result *GremlinResult) bool {
return false
}
var halt = errors.New("Query Timeout")
func (s *Session) runUnsafe(input interface{}) (otto.Value, error) {
s.doHalt = false
s.kill = make(chan struct{})
defer func() {
if caught := recover(); caught != nil {
if caught == halt {
s.err = halt
if r := recover(); r != nil {
if r == ErrKillTimeout {
s.err = ErrKillTimeout
return
}
panic(caught) // Something else happened, repanic!
panic(r)
}
}()
s.env.Interrupt = make(chan func(), 1) // The buffer prevents blocking
// Use buffered chan to prevent blocking.
s.env.Interrupt = make(chan func(), 1)
if s.timeoutSec != -1 {
if s.timeout >= 0 {
go func() {
time.Sleep(s.timeoutSec * time.Second) // Stop after two seconds
s.doHalt = true
time.Sleep(s.timeout)
close(s.kill)
s.envLock.Lock()
defer s.envLock.Unlock()
if s.env != nil {
s.env.Interrupt <- func() {
panic(halt)
panic(ErrKillTimeout)
}
s.env = s.emptyEnv
}
}()
}
return s.env.Run(input) // Here be dragons (risky code)
s.envLock.Lock()
env := s.env
s.envLock.Unlock()
return env.Run(input)
}
func (s *Session) ExecInput(input string, out chan interface{}, limit int) {
defer close(out)
s.err = nil
s.currentChannel = out
s.results = out
var err error
var value otto.Value
if s.script == nil {
@ -152,28 +157,23 @@ func (s *Session) ExecInput(input string, out chan interface{}, limit int) {
} else {
value, err = s.runUnsafe(s.script)
}
if err != nil {
out <- &GremlinResult{metaresult: true,
err: err.Error(),
val: &value,
actualResults: nil}
} else {
out <- &GremlinResult{metaresult: true,
err: "",
val: &value,
actualResults: nil}
out <- &Result{
metaresult: true,
err: err,
val: &value,
}
s.currentChannel = nil
s.results = nil
s.script = nil
s.envLock.Lock()
s.env = s.emptyEnv
return
s.envLock.Unlock()
}
func (s *Session) ToText(result interface{}) string {
data := result.(*GremlinResult)
data := result.(*Result)
if data.metaresult {
if data.err != "" {
return fmt.Sprintln("Error: ", data.err)
if data.err != nil {
return fmt.Sprintf("Error: %v\n", data.err)
}
if data.val != nil {
s, _ := data.val.Export()
@ -220,8 +220,8 @@ func (s *Session) ToText(result interface{}) string {
}
// Web stuff
func (ses *Session) BuildJson(result interface{}) {
data := result.(*GremlinResult)
func (s *Session) BuildJson(result interface{}) {
data := result.(*Result)
if !data.metaresult {
if data.val == nil {
obj := make(map[string]string)
@ -234,33 +234,34 @@ func (ses *Session) BuildJson(result interface{}) {
}
sort.Strings(tagKeys)
for _, k := range tagKeys {
obj[k] = ses.ts.NameOf((*tags)[k])
obj[k] = s.ts.NameOf((*tags)[k])
}
ses.dataOutput = append(ses.dataOutput, obj)
s.dataOutput = append(s.dataOutput, obj)
} else {
if data.val.IsObject() {
export, _ := data.val.Export()
ses.dataOutput = append(ses.dataOutput, export)
s.dataOutput = append(s.dataOutput, export)
} else {
strVersion, _ := data.val.ToString()
ses.dataOutput = append(ses.dataOutput, strVersion)
s.dataOutput = append(s.dataOutput, strVersion)
}
}
}
}
func (ses *Session) GetJson() (interface{}, error) {
defer ses.ClearJson()
if ses.err != nil {
return nil, ses.err
func (s *Session) GetJson() ([]interface{}, error) {
defer s.ClearJson()
if s.err != nil {
return nil, s.err
}
if ses.doHalt {
return nil, halt
select {
case <-s.kill:
return nil, ErrKillTimeout
default:
return s.dataOutput, nil
}
return ses.dataOutput, nil
}
func (ses *Session) ClearJson() {
ses.dataOutput = nil
func (s *Session) ClearJson() {
s.dataOutput = nil
}
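
Review note: the session changes above replace the `doHalt` boolean with a `kill` channel that is closed on timeout and polled with a non-blocking `select`. The sketch below isolates that pattern. It is a minimal illustration, not the session code itself: `countUntilKilled` and `errTimeout` are hypothetical names standing in for the iterator loops and `ErrKillTimeout`.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout is a hypothetical sentinel standing in for ErrKillTimeout.
var errTimeout = errors.New("query timed out")

// countUntilKilled is a hypothetical worker loop. Before each step it
// polls kill without blocking, as the iterator loops in the diff do.
func countUntilKilled(kill <-chan struct{}, max int) (int, error) {
	n := 0
	for n < max {
		select {
		case <-kill:
			// A closed channel is always ready to receive, so every
			// goroutine polling it observes the signal.
			return n, errTimeout
		default:
			// Not killed yet: fall through and do one unit of work.
		}
		n++
	}
	return n, nil
}

func main() {
	// Already-closed channel: the loop stops before doing any work.
	kill := make(chan struct{})
	close(kill)
	fmt.Println(countUntilKilled(kill, 10)) // 0 query timed out

	// Channel that is never closed: the loop runs to completion.
	fmt.Println(countUntilKilled(make(chan struct{}), 10)) // 10 <nil>
}
```

Unlike a plain boolean written from one goroutine and read from another, closing a channel is a synchronized operation, so this form avoids a data race between the timeout goroutine and the query loop.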

View file

@ -23,6 +23,7 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
func (q *Query) buildFixed(s string) graph.Iterator {
@ -33,7 +34,7 @@ func (q *Query) buildFixed(s string) graph.Iterator {
func (q *Query) buildResultIterator(path Path) graph.Iterator {
all := q.ses.ts.NodesAllIterator()
all.AddTag(string(path))
all.Tagger().Add(string(path))
return all
}
@ -97,7 +98,7 @@ func (q *Query) buildIteratorTreeInternal(query interface{}, path Path) (it grap
if err != nil {
return nil, false, err
}
it.AddTag(string(path))
it.Tagger().Add(string(path))
return it, optional, nil
}
@ -139,16 +140,16 @@ func (q *Query) buildIteratorTreeMapInternal(query map[string]interface{}, path
subAnd := iterator.NewAnd()
predFixed := q.ses.ts.FixedIterator()
predFixed.Add(q.ses.ts.ValueOf(pred))
subAnd.AddSubIterator(iterator.NewLinksTo(q.ses.ts, predFixed, graph.Predicate))
subAnd.AddSubIterator(iterator.NewLinksTo(q.ses.ts, predFixed, quad.Predicate))
if reverse {
lto := iterator.NewLinksTo(q.ses.ts, builtIt, graph.Subject)
lto := iterator.NewLinksTo(q.ses.ts, builtIt, quad.Subject)
subAnd.AddSubIterator(lto)
hasa := iterator.NewHasA(q.ses.ts, subAnd, graph.Object)
hasa := iterator.NewHasA(q.ses.ts, subAnd, quad.Object)
subit = hasa
} else {
lto := iterator.NewLinksTo(q.ses.ts, builtIt, graph.Object)
lto := iterator.NewLinksTo(q.ses.ts, builtIt, quad.Object)
subAnd.AddSubIterator(lto)
hasa := iterator.NewHasA(q.ses.ts, subAnd, graph.Subject)
hasa := iterator.NewHasA(q.ses.ts, subAnd, quad.Subject)
subit = hasa
}
}

View file

@ -21,6 +21,7 @@ import (
"github.com/google/cayley/graph"
_ "github.com/google/cayley/graph/memstore"
"github.com/google/cayley/quad"
)
// This is a simple test graph.
@ -36,7 +37,7 @@ import (
// \-->|#D#|------------->+---+
// +---+
//
var simpleGraph = []*graph.Triple{
var simpleGraph = []*quad.Quad{
{"A", "follows", "B", ""},
{"C", "follows", "B", ""},
{"C", "follows", "D", ""},
@ -50,7 +51,7 @@ var simpleGraph = []*graph.Triple{
{"G", "status", "cool", "status_graph"},
}
func makeTestSession(data []*graph.Triple) *Session {
func makeTestSession(data []*quad.Quad) *Session {
ts, _ := graph.NewTripleStore("memstore", "", nil)
for _, t := range data {
ts.AddTriple(t)
@ -164,7 +165,7 @@ var testQueries = []struct {
},
}
func runQuery(g []*graph.Triple, query string) interface{} {
func runQuery(g []*quad.Quad, query string) interface{} {
s := makeTestSession(g)
c := make(chan interface{}, 5)
go s.ExecInput(query, c, -1)

View file

@ -23,6 +23,7 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/query"
)
type Session struct {
@ -62,13 +63,13 @@ func (m *Session) GetQuery(input string, output_struct chan map[string]interface
output_struct <- output
}
func (s *Session) InputParses(input string) (graph.ParseResult, error) {
func (s *Session) InputParses(input string) (query.ParseResult, error) {
var x interface{}
err := json.Unmarshal([]byte(input), &x)
if err != nil {
return graph.ParseFail, err
return query.ParseFail, err
}
return graph.Parsed, nil
return query.Parsed, nil
}
func (s *Session) ExecInput(input string, c chan interface{}, limit int) {
@ -88,7 +89,7 @@ func (s *Session) ExecInput(input string, c chan interface{}, limit int) {
glog.V(2).Infoln(it.DebugString(0))
}
for {
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
break
}
@ -130,7 +131,7 @@ func (s *Session) BuildJson(result interface{}) {
s.currentQuery.treeifyResult(result.(map[string]graph.Value))
}
func (s *Session) GetJson() (interface{}, error) {
func (s *Session) GetJson() ([]interface{}, error) {
s.currentQuery.buildResults()
if s.currentQuery.isError() {
return nil, s.currentQuery.err

View file

@ -12,7 +12,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.
package graph
package query
// Defines the graph session interface general to all query languages.
@ -39,7 +39,7 @@ type HttpSession interface {
ExecInput(string, chan interface{}, int)
GetQuery(string, chan map[string]interface{})
BuildJson(interface{})
GetJson() (interface{}, error)
GetJson() ([]interface{}, error)
ClearJson()
ToggleDebug()
}

View file

@ -19,6 +19,7 @@ import (
"github.com/google/cayley/graph"
"github.com/google/cayley/graph/iterator"
"github.com/google/cayley/quad"
)
func BuildIteratorTreeForQuery(ts graph.TripleStore, query string) graph.Iterator {
@ -189,7 +190,7 @@ func buildIteratorTree(tree *peg.ExpressionTree, ts graph.TripleStore) graph.Ite
nodeID := getIdentString(tree)
if tree.Children[0].Name == "Variable" {
allIt := ts.NodesAllIterator()
allIt.AddTag(nodeID)
allIt.Tagger().Add(nodeID)
out = allIt
} else {
n := nodeID
@ -208,7 +209,7 @@ func buildIteratorTree(tree *peg.ExpressionTree, ts graph.TripleStore) graph.Ite
i++
}
it := buildIteratorTree(tree.Children[i], ts)
lto := iterator.NewLinksTo(ts, it, graph.Predicate)
lto := iterator.NewLinksTo(ts, it, quad.Predicate)
return lto
case "RootConstraint":
constraintCount := 0
@ -229,16 +230,16 @@ func buildIteratorTree(tree *peg.ExpressionTree, ts graph.TripleStore) graph.Ite
return and
case "Constraint":
var hasa *iterator.HasA
topLevelDir := graph.Subject
subItDir := graph.Object
topLevelDir := quad.Subject
subItDir := quad.Object
subAnd := iterator.NewAnd()
isOptional := false
for _, c := range tree.Children {
switch c.Name {
case "PredIdentifier":
if c.Children[0].Name == "Reverse" {
topLevelDir = graph.Object
subItDir = graph.Subject
topLevelDir = quad.Object
subItDir = quad.Subject
}
it := buildIteratorTree(c, ts)
subAnd.AddSubIterator(it)

View file

@ -18,6 +18,8 @@ import (
"testing"
"github.com/google/cayley/graph"
"github.com/google/cayley/quad"
_ "github.com/google/cayley/graph/memstore"
)
@ -30,21 +32,21 @@ func TestBadParse(t *testing.T) {
var testQueries = []struct {
message string
add *graph.Triple
add *quad.Quad
query string
typ graph.Type
expect string
}{
{
message: "get a single triple linkage",
add: &graph.Triple{"i", "can", "win", ""},
add: &quad.Quad{"i", "can", "win", ""},
query: "($a (:can \"win\"))",
typ: graph.And,
expect: "i",
},
{
message: "get a single triple linkage",
add: &graph.Triple{"i", "can", "win", ""},
add: &quad.Quad{"i", "can", "win", ""},
query: "(\"i\" (:can $a))",
typ: graph.And,
expect: "i",
@ -65,7 +67,7 @@ func TestMemstoreBackedSexp(t *testing.T) {
if it.Type() != test.typ {
t.Errorf("Incorrect type for %s, got:%q expect %q", test.message, it.Type(), test.typ)
}
got, ok := it.Next()
got, ok := graph.Next(it)
if !ok {
t.Errorf("Failed to %s", test.message)
}
@ -77,8 +79,8 @@ func TestMemstoreBackedSexp(t *testing.T) {
func TestTreeConstraintParse(t *testing.T) {
ts, _ := graph.NewTripleStore("memstore", "", nil)
ts.AddTriple(&graph.Triple{"i", "like", "food", ""})
ts.AddTriple(&graph.Triple{"food", "is", "good", ""})
ts.AddTriple(&quad.Quad{"i", "like", "food", ""})
ts.AddTriple(&quad.Quad{"food", "is", "good", ""})
query := "(\"i\"\n" +
"(:like\n" +
"($a (:is :good))))"
@ -86,7 +88,7 @@ func TestTreeConstraintParse(t *testing.T) {
if it.Type() != graph.And {
t.Errorf("Odd iterator tree. Got: %s", it.DebugString(0))
}
out, ok := it.Next()
out, ok := graph.Next(it)
if !ok {
t.Error("Got no results")
}
@ -97,13 +99,13 @@ func TestTreeConstraintParse(t *testing.T) {
func TestTreeConstraintTagParse(t *testing.T) {
ts, _ := graph.NewTripleStore("memstore", "", nil)
ts.AddTriple(&graph.Triple{"i", "like", "food", ""})
ts.AddTriple(&graph.Triple{"food", "is", "good", ""})
ts.AddTriple(&quad.Quad{"i", "like", "food", ""})
ts.AddTriple(&quad.Quad{"food", "is", "good", ""})
query := "(\"i\"\n" +
"(:like\n" +
"($a (:is :good))))"
it := BuildIteratorTreeForQuery(ts, query)
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
t.Error("Got no results")
}
@ -117,7 +119,7 @@ func TestTreeConstraintTagParse(t *testing.T) {
func TestMultipleConstraintParse(t *testing.T) {
ts, _ := graph.NewTripleStore("memstore", "", nil)
for _, tv := range []*graph.Triple{
for _, tv := range []*quad.Quad{
{"i", "like", "food", ""},
{"i", "like", "beer", ""},
{"you", "like", "beer", ""},
@ -133,14 +135,14 @@ func TestMultipleConstraintParse(t *testing.T) {
if it.Type() != graph.And {
t.Errorf("Odd iterator tree. Got: %s", it.DebugString(0))
}
out, ok := it.Next()
out, ok := graph.Next(it)
if !ok {
t.Error("Got no results")
}
if out != ts.ValueOf("i") {
t.Errorf("Got %d, expected %d", out, ts.ValueOf("i"))
}
_, ok = it.Next()
_, ok = graph.Next(it)
if ok {
t.Error("Too many results")
}

View file

@ -22,6 +22,7 @@ import (
"sort"
"github.com/google/cayley/graph"
"github.com/google/cayley/query"
)
type Session struct {
@ -39,7 +40,7 @@ func (s *Session) ToggleDebug() {
s.debug = !s.debug
}
func (s *Session) InputParses(input string) (graph.ParseResult, error) {
func (s *Session) InputParses(input string) (query.ParseResult, error) {
var parenDepth int
for i, x := range input {
if x == '(' {
@ -52,17 +53,17 @@ func (s *Session) InputParses(input string) (graph.ParseResult, error) {
if (i - 10) > min {
min = i - 10
}
return graph.ParseFail, errors.New(fmt.Sprintf("Too many close parens at char %d: %s", i, input[min:i]))
return query.ParseFail, fmt.Errorf("Too many close parens at char %d: %s", i, input[min:i])
}
}
}
if parenDepth > 0 {
return graph.ParseMore, nil
return query.ParseMore, nil
}
if len(ParseString(input)) > 0 {
return graph.Parsed, nil
return query.Parsed, nil
}
return graph.ParseFail, errors.New("Invalid Syntax")
return query.ParseFail, errors.New("Invalid Syntax")
}
func (s *Session) ExecInput(input string, out chan interface{}, limit int) {
@ -77,7 +78,7 @@ func (s *Session) ExecInput(input string, out chan interface{}, limit int) {
}
nResults := 0
for {
_, ok := it.Next()
_, ok := graph.Next(it)
if !ok {
break
}

View file

@ -48,7 +48,7 @@ $(function() {
subject: $("#subject").val(),
predicate: $("#predicate").val(),
object: $("#object").val(),
provenance: $("#provenance").val()
label: $("#label").val()
}
if (!checkTriple(triple)) {
return
@ -68,7 +68,7 @@ $(function() {
subject: $("#rsubject").val(),
predicate: $("#rpredicate").val(),
object: $("#robject").val(),
provenance: $("#rprovenance").val()
label: $("#rlabel").val()
}
if (!checkTriple(triple)) {
return

View file

@ -45,7 +45,7 @@
<input id="subject" type="text" placeholder="Subject"></input>
<input id="predicate" type="text" placeholder="Predicate"></input>
<input id="object" type="text" placeholder="Object"></input>
<input id="provenance" type="text" placeholder="Provenance"></input>
<input id="label" type="text" placeholder="Label"></input>
</div>
</div>
<div class="row button-row">
@ -59,7 +59,7 @@
<input id="rsubject" type="text" placeholder="Subject"></input>
<input id="rpredicate" type="text" placeholder="Predicate"></input>
<input id="robject" type="text" placeholder="Object"></input>
<input id="rprovenance" type="text" placeholder="Provenance"></input>
<input id="rlabel" type="text" placeholder="Label"></input>
</div>
</div><!-- /.col-xs-12 main -->
<div class="row button-row">

9
testdata.nq Normal file
View file

@ -0,0 +1,9 @@
<alice> <follows> <bob> .
<bob> <follows> <alice> .
<charlie> <follows> <bob> .
<dani> <follows> <charlie> .
<dani> <follows> <alice> .
<alice> <is> "cool" .
<bob> <is> "not cool" .
<charlie> <is> "cool" .
<dani> <is> "not cool" .

View file

@ -1,9 +0,0 @@
alice follows bob .
bob follows alice .
charlie follows bob .
dani follows charlie .
dani follows alice .
alice is cool .
bob is "not cool" .
charlie is cool .
dani is "not cool" .