Note for readers: I wrote this patch a year ago and it no longer works. Please treat it as reference code only.
Last Tested with:
Drill 1.2.0
Cassandra 2.2.0
In this short post I will talk about Drill's Cassandra storage plugin, which lets us query Cassandra through Apache Drill. That also means we can issue ANSI SQL queries against Cassandra, something Cassandra does not support natively.
All the code: https://github.com/yssharma/drill/tree/cassandra-storage
Patch: https://gist.github.com/yssharma/2581ae8a97c559b2677f
There are a couple of steps to set up Cassandra storage before we can start playing with Cassandra and Drill. Download the patch and save it to a file (here: DRILL-92-CassandraStorage.patch).
1. Get Drill: Let's get the Drill source
$> git clone https://github.com/apache/drill.git
2. Get Cassandra Storage patch:
Download the Patch file from
https://reviews.apache.org/r/29816/diff/raw/
3. Apply the patch on top of Drill
$> cd drill
$> git apply --check ~/Downloads/DRILL-92-CassandraStorage.patch
$> git apply ~/Downloads/DRILL-92-CassandraStorage.patch
4. Build Drill with Cassandra Storage & export distribution to /opt/drill
$> mvn clean install -DskipTests
$> mkdir /opt/drill
$> tar xvzf distribution/target/*.tar.gz --strip=1 -C /opt/drill
5. Start Sqlline.
That's it: we have finished the Drill build and installation, and it is time to start using Drill.
$> cd /opt/drill
$> bin/sqlline -u jdbc:drill:zk=local -n admin -p admin
Run 'show schemas;' to view the existing schemas.
6. Drill Web interface
Now we should be able to see the Drill web interface on localhost:8047.
7. Configure Cassandra Plugin:
Now it's time to configure Cassandra with Drill.
Go to the Storage page from the top navigation bar and add a new plugin named 'cassandra'. On the next page, provide the details of your Cassandra installation.
Here is the config I used:
New Storage plugin format:
{
  "type": "cassandra",
  "config": {
    "cassandra.hosts": [
      "127.0.0.1",
      "127.0.0.2"
    ],
    "cassandra.port": 9042
  },
  "enabled": true
}
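Before pasting the config into the web form, it can help to confirm that it parses as plain JSON. Text copied from a web page sometimes carries typographic quotes, which JSON parsers reject, and that is one plausible (but unconfirmed) reason a config gets refused. Here is a minimal Python sketch; the function name check_plugin_config and the checks it performs are my own illustration, not part of Drill or of the patch.

```python
import json

# Typographic quotes that blog engines often substitute for plain ASCII quotes.
SMART_QUOTES = "\u201c\u201d\u2018\u2019"


def check_plugin_config(text):
    """Return a list of problems found in a storage plugin config string.

    Hypothetical helper: it only checks what this post shows, namely that
    the text is valid JSON and that "type" is "cassandra".
    """
    problems = []
    if any(ch in text for ch in SMART_QUOTES):
        problems.append("contains typographic quotes; use plain ASCII quotes")
        return problems
    try:
        config = json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError
        problems.append("not valid JSON: %s" % exc)
        return problems
    if not isinstance(config, dict):
        problems.append("top level should be a JSON object")
        return problems
    if config.get("type") != "cassandra":
        problems.append('"type" should be "cassandra"')
    return problems


# The config from this post, with plain ASCII quotes.
CONFIG = """
{
  "type": "cassandra",
  "config": {
    "cassandra.hosts": ["127.0.0.1", "127.0.0.2"],
    "cassandra.port": 9042
  },
  "enabled": true
}
"""

if __name__ == "__main__":
    print(check_plugin_config(CONFIG) or "config looks fine")
```

An empty list means the text at least parses and names the cassandra type. It says nothing about whether the patched Drill build will accept it.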
That's it. Enough work. It's playtime now.
8. Query Cassandra.
It's time to start querying Cassandra via Drill.
Go to the Query page from the top navigation menu and fire your SQL query on existing Cassandra tables.
Note: Make sure Cassandra is up and running.
The general query format is:
SELECT * FROM cassandra.<keyspace_name>.<table_name> LIMIT 10;
Stop Sqlline and restart it if required.
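The same query format can also be built and submitted from a script. Below is a rough Python sketch that assumes the build exposes Drill's REST endpoint /query.json on the web port used above; I have not verified it against this patched build. The helper names (build_query, build_request, run_query) and the keyspace and table names in the usage note are made up for illustration.

```python
import json
import re
import urllib.request

# Assumption: the Drill web server from this post, on its default port.
DRILL_URL = "http://localhost:8047/query.json"

# Only accept simple identifiers so names can be dropped into the SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_query(keyspace, table, limit=10):
    """Build SELECT * FROM cassandra.<keyspace>.<table> LIMIT <n>."""
    for name in (keyspace, table):
        if not _IDENT.match(name):
            raise ValueError("unexpected identifier: %r" % name)
    # No trailing semicolon: that is a Sqlline statement terminator.
    return "SELECT * FROM cassandra.%s.%s LIMIT %d" % (keyspace, table, int(limit))


def build_request(sql):
    """Wrap a SQL string in a POST request for the query endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        DRILL_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )


def run_query(sql):
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(sql)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With Drill and Cassandra both running, something like run_query(build_query("demo_ks", "users")) would send the query, where demo_ks and users stand in for your own keyspace and table.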
Cool. Try some complex SQL now. Play Around.
We can also explore the existing Schemas via Sqlline:
That's all for this post. Hope it was helpful.
See ya all soon. Cheers \m/
Hi,
We have an urgent requirement to query Cassandra through Drill. For that purpose I'm following the steps above, but when running
"git apply --check ~/Downloads/DRILL-92-CassandraStorage.patch" I get the error below:
"fatal: corrupt patch at line 3308".
Please help me get Cassandra querying through Drill working. Your help would be greatly appreciated.
Thanks in advance.
Hello,
I tried to post the Cassandra configuration the same way you did:
{
  "type": "cassandra",
  "config": {
    "cassandra.hosts": [
      "127.0.0.1",
      "127.0.0.2"
    ],
    "cassandra.port": 9042
  },
  "enabled": true
}
I also tried changing cassandra.hosts to node9 (the name of our server) or to its IP (192.168.168.29). But each time I click create, it says
please retry: error (invalid JSON mapping)
Would you mind pointing out what the problem is?
thanks
I have another question about applying the patch to Drill.
Here is what I got when applying the patch:
[root@node9 drill]# git apply --check /opt/DRILL-92-CassandraStorage.patch
error: patch failed: contrib/pom.xml:37
error: contrib/pom.xml: patch does not apply
error: patch failed: distribution/pom.xml:160
error: distribution/pom.xml: patch does not apply
error: patch failed: distribution/src/assemble/bin.xml:92
error: distribution/src/assemble/bin.xml: patch does not apply
Please kindly point out the solution for the above errors.
thanks
I’m having the exact same problem and I would like some help. Did this ever get resolved for you?
thanks
We could not compile it.
mvn clean install
[INFO] Scanning for projects…
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR] The project org.apache.drill.contrib:drill-storage-cassandra:0.9.0-SNAPSHOT (/root/drill/contrib/storage-cassandra/pom.xml) has 1 error
[ERROR] Non-resolvable parent POM: Could not find artifact org.apache.drill.contrib:drill-contrib-parent:pom:0.9.0-SNAPSHOT and 'parent.relativePath' points at wrong local POM @ line 22, column 13 -> [Help 2]