Deploying Cassandra with Puppet
Posted on August 6th, 2011
(This is an updated remake of the Puppet recipe on my blog.)
Cassandra has a peer-to-peer architecture and is typically deployed on a large number of servers. Deploying, managing, and upgrading these systems takes more and more administrative time as your cluster grows. Puppet provides a simple way to install Cassandra on every node.
Getting ready
For this recipe you will need a server running Puppet.
How to do it…
Puppet has a file server built in. By default the root of the file server is /var/lib/puppet/files. Create a folder for Cassandra there, download a release into that folder, and rename the configuration files that differ per cluster so they are not pushed out as they are.
mkdir -p /var/lib/puppet/files/cassandra
cd /var/lib/puppet/files/cassandra
wget apache-cassandra-0.8.2.tar.gz
tar -xf apache-cassandra-0.8.2.tar.gz
mv apache-cassandra-0.8.2/conf/cassandra.yaml apache-cassandra-0.8.2/conf/cassandra.yaml.bak
mv apache-cassandra-0.8.2/conf/cassandra-env.sh apache-cassandra-0.8.2/conf/cassandra-env.sh.bak
Create a Puppet manifest for Cassandra at /etc/puppet/manifests/cassandra.pp. Start with the base requirements that any Cassandra instance would share.
class cassandra_base {
  package { "jdk":
    ensure => installed
  }
  group { "cassandra":
    ensure => present
  }
  # gid puts the user in the group declared above
  user { "cassandra":
    ensure  => present,
    gid     => "cassandra",
    home    => "/home/cassandra",
    shell   => "/bin/bash",
    require => Group["cassandra"]
  }
  # data, log, and cache directories owned by the cassandra user
  file { ["/var/lib/cassandra", "/var/log/cassandra", "/var/lib/cassandra/saved_caches"]:
    ensure  => directory,
    owner   => "cassandra",
    group   => "cassandra",
    mode    => "0755",
    require => User["cassandra"]
  }
  file { "/etc/init.d/cassandra":
    mode    => "0744",
    source  => "puppet:///mainfiles/cassandra/cassandra.init",
    require => Package["jdk"]
  }
}
Now extend the base definition for a specific version of Cassandra.
class cassandra_0_8_2 inherits cassandra_base {
  file { "/usr/local/apache-cassandra-0.8.2":
    owner   => "root",
    group   => "root",
    source  => "puppet:///mainfiles/cassandra/apache-cassandra-0.8.2",
    recurse => true
  }
  file { "/usr/local/apache-cassandra-0.8.2/bin":
    mode    => "0755",
    source  => "puppet:///mainfiles/cassandra/apache-cassandra-0.8.2/bin",
    recurse => true,
    require => File["/usr/local/apache-cassandra-0.8.2"]
  }
  file { "/usr/local/cassandra":
    ensure  => link,
    target  => "/usr/local/apache-cassandra-0.8.2",
    require => File["/usr/local/apache-cassandra-0.8.2"]
  }
}
Extend this one more time per cluster. Use the fqdn fact in Puppet so each node gets its own copy of the configuration file.
class cassandra_lab_0_8_2 inherits cassandra_0_8_2 {
  file { "/usr/local/cassandra/conf/cassandra-env.sh":
    owner   => "root",
    group   => "root",
    source  => "puppet:///mainfiles/cassandra/lab-0.8.2/cassandra-env.sh",
    require => File["/usr/local/cassandra"]
  }
  # ${fqdn} is interpolated per node, e.g. cassandra.yaml.server1.domain.com
  file { "/usr/local/cassandra/conf/cassandra.yaml":
    owner   => "root",
    group   => "root",
    source  => "puppet:///mainfiles/cassandra/lab-0.8.2/cassandra.yaml.${fqdn}",
    require => File["/usr/local/cassandra"]
  }
}
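Now set up the configuration files for this cluster. The manifest expects one shared cassandra-env.sh and one cassandra.yaml per node, named after the node's fully qualified domain name.
mkdir /var/lib/puppet/files/cassandra/lab-0.8.2
cp /var/lib/puppet/files/cassandra/apache-cassandra-0.8.2/conf/cassandra-env.sh.bak /var/lib/puppet/files/cassandra/lab-0.8.2/cassandra-env.sh
cp /var/lib/puppet/files/cassandra/apache-cassandra-0.8.2/conf/cassandra.yaml.bak /var/lib/puppet/files/cassandra/lab-0.8.2/cassandra.yaml.server1.domain.com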
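With more than a couple of nodes, copying the template by hand gets tedious. The following is a minimal sketch of a helper that stages one cassandra.yaml per host name. The function name stage_node_configs and its argument order are made up for illustration; they are not part of Puppet or Cassandra.

```shell
#!/bin/sh
# Sketch: stage one cassandra.yaml.<fqdn> per node in a per-cluster folder.
# stage_node_configs is a hypothetical helper, not a Puppet command.
stage_node_configs() {
    # Need at least a template and a destination directory.
    [ "$#" -ge 2 ] || return 1
    template="$1"   # pristine config, e.g. .../conf/cassandra.yaml.bak
    dest_dir="$2"   # per-cluster folder, e.g. .../cassandra/lab-0.8.2
    shift 2
    mkdir -p "$dest_dir" || return 1
    # Every remaining argument is treated as a node's fully qualified name.
    for host in "$@"; do
        target="$dest_dir/cassandra.yaml.$host"
        # Leave files that already exist alone; they may have been edited.
        if [ ! -e "$target" ]; then
            cp "$template" "$target" || return 1
        fi
    done
}
```

For example, calling stage_node_configs with the cassandra.yaml.bak path, the lab-0.8.2 folder, and then server1.domain.com server2.domain.com would create both per-node files. Each file still has to be edited for its node (listen_address, initial_token, and so on).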
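For each server in the cluster, include the class we created. Every node listed here needs its own cassandra.yaml file on the file server, named with that node's fully qualified domain name.
node 'server1.domain.com', 'server2.domain.com' {
  include cassandra_lab_0_8_2
}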
How it works…
The Puppet configuration management system copies files from the server to clients. This makes it easy to replicate installations across large clusters of servers. Puppet file resources copy single files or whole directories recursively from the Puppet server to the client. Puppet also manages system users and groups, as well as packages, in a platform-independent way.
Tags: cassandra, nosql, puppet
Filed under Chapter 7 Administration
Using jmap and jhat to analyze Java heap
Posted on August 6th, 2011
While Java takes care of most of the details of memory and heap management, situations can occur where memory is never reclaimed. This recipe uses two Java tools, jmap and jhat, to capture a heap dump and examine it.
How to do it…
Determine the pid of the running Java process.
$ ps -ef | grep java
edward 3736 1 1 01:46 pts/0 00:00:06
Use the jmap tool to dump the heap to a file.
$ jmap -dump:file=b 3736
Dumping heap to /home/edward/hpcas/b …
Heap dump file created
Start a jhat web server on port 7001. (jhat defaults to port 7000, which is also the Cassandra storage port.)
$ jhat -port 7001 /home/edward/hpcas/b
Chasing references, expect 15 dots……………
Eliminating duplicate references……………
Snapshot resolved.
Started HTTP server on port 7001
Server is ready.
Use the web interface to explore the heap.
How it works…
jmap captures a heap dump and jhat lets you review it in a browser. Together they are valuable tools when chasing down memory leaks or unexplained memory usage.
There is more…
Cassandra’s heap can be large, and viewing a dump of it in jhat can require a lot of RAM. A trick for not impacting your production Cassandra servers is to copy the dump to another system and use jhat to view it there.
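The copy-then-analyze trick could be scripted roughly as follows. This is only a sketch: dump_and_ship is a name I made up, the destination is whatever host you pick, and the JMAP and SCP variables exist only so the commands can be swapped out for a dry run. It uses the format=b option of jmap to write a binary dump.

```shell
#!/bin/sh
# Sketch: dump the heap of a running JVM, then copy the dump elsewhere.
# JMAP and SCP default to the real tools; override them for a dry run.
JMAP="${JMAP:-jmap}"
SCP="${SCP:-scp}"

# dump_and_ship is a hypothetical helper, not part of the JDK.
dump_and_ship() {
    pid="$1"         # pid of the Cassandra JVM, found with ps
    dump_file="$2"   # local path where the heap dump is written
    remote="$3"      # scp destination, e.g. analysis-host:/tmp/ (assumed)
    # Write a binary heap dump of the process to dump_file.
    "$JMAP" "-dump:format=b,file=$dump_file" "$pid" || return 1
    # Copy the dump off the production server.
    "$SCP" "$dump_file" "$remote" || return 1
}
```

On the other system you would then run jhat against the copied file, for example jhat -port 7001 -J-Xmx4g heap.bin, where -J passes a flag through to the JVM that runs jhat. The 4g value is an arbitrary example; size it to the dump.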
Tags: cassandra, jhat, jmap, nosql
Filed under Chapter 13 Monitoring
Why cookbook examples are in Java
Posted on August 4th, 2011
Someone on twitter mentioned:
BTW, woulda been really nice if the C* HP Cookbook spelled out it’s targeted for java developers
There are many reasons that the Cassandra High Performance Cookbook examples were written in Java:
- I am most proficient in Java. After Java, my drop-off in skill is fairly steep.
- Apache Cassandra is written in Java, and the most support is there.
- Thrift generates bindings for C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, and OCaml. Writing examples for each possible language would be hard.
- Each language usually has one or two clients; for example, Java has Pelops and Hector.
- The two points above multiply: covering every client in every language means (languages) × (clients per language) sets of examples.
- Thrift-generated RPC stubs for every language have the same functionality, so the client language is less important.
- Cassandra: The Definitive Guide covered multiple languages (http://oreilly.com/catalog/0636920010852). It took over 30 pages, and all those examples got out of date fast because clients change. I feel I did not need to rehash that.
- I believe a person wanting to learn Cassandra does not get a lot out of the same information being rehashed for multiple languages.
- Each language has its own environment setup “issues”: Perl has CPAN, C has system libraries. That information goes out of scope fast.
There is one recipe that is a stepping stone to other languages: Chapter 3 has the recipe “Generating Thrift bindings for other languages (C++, PHP, and others)”. I will make an effort to add some blog entries to this site for other languages.
Filed under Uncategorized
