Daybreak
Daybreak is a simple and very fast key-value store for Ruby. It has user-defined persistence, and all data is held in an in-memory table, so the usual Ruby niceties are available. Daybreak is faster than PStore and DBM.
The source is on GitHub, and you can install it with:
$ gem install daybreak
(v0.3.0) | API Docs | Issue Tracker
Daybreak stores data in an append-only file, and values inserted into the database are marshalled Ruby objects. It includes Enumerable for functional methods like map and reduce, and it emulates the interface of a simple Ruby hash. Here is the basic API:
require 'daybreak'
db = Daybreak::DB.new "example.db"
# set the value of a key
db['foo'] = 2
# set the value of a key and flush the change to disk
db.set! 'bar', 2
# You can also use atomic batch updates
db.update :alpha => 1, :beta => 2
db.update! :alpha => 1, :beta => 2
# all keys are cast to strings via #to_s
db[1] = 2
db.keys.include? 1 # => false
db.keys.include? '1' # => true
# ensure changes are sent to disk
db.flush
# open up another db client on the same file
db2 = Daybreak::DB.new "example.db"
db2['foo'] = 3
# Ruby objects work too
db2['baz'] = {:one => 1}
db2.flush
# Reread the changed file in the first db
db.load
p db['foo'] #=> 3
p db['baz'] #=> {:one => 1}
# Enumerable works too!
1000.times {|i| db[i] = i }
# sum only the values stored under the numeric keys written above
p db.reduce(0) {|m, (k, v)| k =~ /\A\d+\z/ ? m + v : m } # => 499500
# Compaction is always a good idea. It will cut down on the size of the database file
db.compact
p db['foo'] #=> 3
db2.load
p db2['foo'] #=> 3
# DBs can be accessed from multiple processes at the same
# time. You can use #lock to make an operation atomic.
db.lock do
  # start from 0 if the counter has not been set yet
  db['counter'] = (db['counter'] || 0) + 1
end
# If you want to synchronize only between threads, prefer synchronize over lock!
db.synchronize do
  db['counter'] += 1
end
# DBs can have default values
db3 = Daybreak::DB.new "example3.db", :default => 'hello!'
db3['bingo'] #=> hello!
# If you don't want Marshal as the serializer, you can write your own
# by inheriting from Daybreak::Serializer::Default (MyJsonSerializer is such a class)
db4 = Daybreak::DB.new "example4.db", :serializer => MyJsonSerializer
# close the databases
db.close
db2.close
db3.close
db4.close
You can provide your own serializer if you want a different serialization strategy (for example, JSON); see Daybreak::Serializer::Default. You can also provide your own format if you want to lay out the database log differently; see Daybreak::Format.
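As a rough illustration of what a JSON serializer could look like, here is a minimal sketch. It is written as a standalone class so it runs without the gem; in a real project you would inherit from Daybreak::Serializer::Default as described above. The method names `key_for`, `dump` and `load` are assumptions about the interface that class exposes, so check them against the API docs before relying on them.

```ruby
require 'json'

# Hypothetical serializer. The three method names are assumed to match
# the interface of Daybreak::Serializer::Default.
class MyJsonSerializer
  # Keys are stored as strings.
  def key_for(key)
    key.to_s
  end

  # Turn a Ruby value into the string that is written to the log.
  def dump(value)
    JSON.generate(value)
  end

  # Turn a stored string back into a Ruby value.
  def load(value)
    JSON.parse(value)
  end
end

serializer = MyJsonSerializer.new
encoded = serializer.dump('one' => 1)
decoded = serializer.load(encoded)
p encoded
p decoded
```

Note that JSON has no symbols, so a hash stored with symbol keys comes back with string keys. That is one trade-off to weigh against Marshal.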
When a Daybreak database is opened, it reads the append-only file and mirrors the data in an in-memory hash table for fast reads.
Writes to a Daybreak database are asynchronous: each write is queued. If you want to commit to the file immediately, call flush after a write.
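The difference between a queued write and a flushed write can be sketched with a worker thread. This is not Daybreak's implementation: `QueuedLog` is a hypothetical class, and an in-memory array stands in for the file on disk. It only shows the general pattern of returning from a write immediately and blocking in flush until the queue is drained.

```ruby
require 'thread'

# Hypothetical stand-in for a store with queued writes.
class QueuedLog
  attr_reader :written

  def initialize
    @queue = Queue.new
    @written = []                     # stands in for the file on disk
    @mutex = Mutex.new
    @drained = ConditionVariable.new
    @pending = 0
    @worker = Thread.new { run }
  end

  # Returns immediately; the worker thread writes the record later.
  def write(record)
    @mutex.synchronize { @pending += 1 }
    @queue << record
  end

  # Blocks until every queued record has been written.
  def flush
    @mutex.synchronize do
      @drained.wait(@mutex) while @pending > 0
    end
  end

  private

  # Worker loop: take records off the queue one by one and "write" them.
  def run
    loop do
      record = @queue.pop
      @mutex.synchronize do
        @written << record
        @pending -= 1
        @drained.broadcast if @pending.zero?
      end
    end
  end
end

log = QueuedLog.new
3.times { |i| log.write("record #{i}") }
log.flush
p log.written.size # prints 3
```

Without the call to flush, the final line could run before the worker has processed all three records.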
Daybreak is multi-process safe. Synchronization with other processes is done by calling load or lock. load updates the in-memory hash table with new database records from the filesystem. Use lock if you want to make operations atomic across process boundaries.
If you only want to synchronize between different threads, prefer synchronize over lock. Be aware that Daybreak is not thread-safe by default, so every access has to be wrapped in synchronize. This holds at least on interpreters without a global interpreter lock (Rubinius, JRuby).
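Why a cross-process lock matters can be shown without Daybreak. The sketch below is an illustration under assumptions, not the library's code: it uses an advisory file lock (`File#flock`) as the locking mechanism, and `with_file_lock` and the counter file are hypothetical. Each process re-reads the shared state inside the lock, which plays the role of load, and then writes the new value. It needs a platform that supports fork.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: runs the block while holding an exclusive
# advisory lock on lock_path.
def with_file_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0644) do |file|
    file.flock(File::LOCK_EX)
    begin
      yield
    ensure
      file.flock(File::LOCK_UN)
    end
  end
end

dir = Dir.mktmpdir
counter_path = File.join(dir, 'counter')
lock_path = File.join(dir, 'counter.lock')
File.write(counter_path, '0')

# Four child processes each increment the shared counter 25 times.
pids = 4.times.map do
  fork do
    25.times do
      with_file_lock(lock_path) do
        value = File.read(counter_path).to_i   # re-read the current state
        File.write(counter_path, (value + 1).to_s)
      end
    end
  end
end
pids.each { |pid| Process.wait(pid) }

total = File.read(counter_path).to_i
FileUtils.remove_entry(dir)
p total # prints 100
```

Without the lock, two processes could read the same value and one increment would be lost.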
Writes with duplicate keys are simply appended to the end of the file. From time to time you will want to run compact, which removes old commits from the file and creates a smaller log file. This shrinks the disk space needed to store the data. You can also compact from a background process.
Daybreak stores its data in a very simple file format. Each Daybreak file is an append-only log in which every record consists of a 32-bit big-endian key length, a 32-bit big-endian value length, the key data and the value data. Every key-value pair also has an associated 32-bit CRC field to protect against bad data. The special value 0xFFFFFFFF for the value length denotes a deleted record. Here is how a database of one record might look:
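The idea behind compaction can be sketched on an in-memory model of the log. This is a simplified illustration, not Daybreak's code: the log is assumed to be an array of `[key, value]` pairs in write order, and the `:deleted` marker is a hypothetical stand-in for a deletion record.

```ruby
# Keep only the most recent value for every key and drop deleted keys.
# records is an array of [key, value] pairs in write order.
def compact(records)
  latest = {}
  records.each do |key, value|
    if value == :deleted
      latest.delete(key)      # a deletion record removes the key entirely
    else
      latest[key] = value     # later writes replace earlier ones
    end
  end
  latest.to_a
end

log = [['foo', 1], ['bar', 2], ['foo', 3], ['bar', :deleted], ['baz', 4]]
compacted = compact(log)
p compacted # prints [["foo", 3], ["baz", 4]]
```

Five records shrink to two because the overwritten value of foo and both records for bar no longer need to be stored.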
32 bit Key length | 32 bit Value length | Key | Value | CRC32 |
---|---|---|---|---|
(...)0000101 | (...)0001010 | hello | <marshalled value> | (...)11010 |
These values are all read into an in-memory hash table, and commits to the database are queued for writing. A reminder: call flush if you want commits to block until they are written to the filesystem.
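The record layout described above can be sketched with `Array#pack` and `Zlib.crc32`. Treat this as an illustration under assumptions: the text does not say which bytes the CRC covers, so the sketch assumes it is computed over the two length fields plus the key and value, and it ignores any file header the real format may have. `encode_record` and `decode_record` are hypothetical names.

```ruby
require 'zlib'

DELETED = 0xFFFFFFFF # value length that marks a deleted record

# Encode one record: key length, value length, key, value, CRC32.
# Pass nil as the value to encode a deletion.
# Assumption: the CRC covers the length fields, the key and the value.
def encode_record(key, value)
  body = [key.bytesize, value ? value.bytesize : DELETED].pack('NN')
  body << key.b
  body << value.b if value
  body << [Zlib.crc32(body)].pack('N')
end

# Decode the record at the start of data. Returns [key, value], where
# value is nil for a deleted record. Raises if the CRC does not match.
def decode_record(data)
  key_len, value_len = data.unpack('NN')
  deleted = value_len == DELETED
  payload_len = 8 + key_len + (deleted ? 0 : value_len)
  body = data.byteslice(0, payload_len)
  crc = data.byteslice(payload_len, 4).unpack('N').first
  raise 'CRC mismatch' unless crc == Zlib.crc32(body)
  key = body.byteslice(8, key_len)
  value = deleted ? nil : body.byteslice(8 + key_len, value_len)
  [key, value]
end

record = encode_record('hello', Marshal.dump('world'))
key, value = decode_record(record)
p key
p Marshal.load(value)
```

`'N'` packs an unsigned 32-bit integer in big-endian order, which matches the length and CRC fields in the table.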
Daybreak is tested using Travis-CI. We also run benchmarks there, which compare Daybreak against DBM, GDBM and Hash.
If you are interested in benchmarks, you can also take a look at the Moneta benchmarks, where Daybreak is compared to virtually all existing key/value stores. It seems to be the fastest persistent database among all the Moneta backends.
Summary uniform_medium: 3 runs, 1000 keys

Backend | | Minimum | Maximum | Total | Mean | Stddev | Ops/s |
---|---|---|---|---|---|---|---|
Memory | sum | 17 | 19 | 55 | 18 | 0 | 53725 |
Daybreak | sum | 20 | 26 | 68 | 22 | 2 | 44036 |
LevelDB | sum | 40 | 44 | 129 | 43 | 1 | 23176 |
TDB | sum | 40 | 53 | 148 | 49 | 6 | 20192 |
GDBM | sum | 39 | 70 | 151 | 50 | 14 | 19832 |
DBM | sum | 38 | 77 | 171 | 57 | 16 | 17491 |
LRUHash | sum | 56 | 99 | 211 | 70 | 20 | 14177 |
Sqlite | sum | 134 | 167 | 438 | 146 | 15 | 6845 |
File | sum | 333 | 444 | 1190 | 396 | 46 | 2519 |
HashFile | sum | 471 | 494 | 1451 | 483 | 9 | 2066 |
Redis | sum | 656 | 818 | 2218 | 739 | 65 | 1352 |
MemcachedDalli | sum | 700 | 1051 | 2532 | 844 | 150 | 1184 |
MemcachedNative | sum | 822 | 979 | 2661 | 887 | 66 | 1127 |
Client | sum | 906 | 970 | 2814 | 938 | 26 | 1065 |
Sequel | sum | 2090 | 2635 | 6992 | 2330 | 227 | 429 |
Mongo | sum | 2053 | 2704 | 7108 | 2369 | 265 | 422 |
DataMapper | sum | 7984 | 11287 | 27909 | 9303 | 1428 | 107 |
Couch | sum | 15481 | 18786 | 51336 | 17112 | 1349 | 58 |
Riak | sum | 15597 | 22437 | 56838 | 18946 | 2794 | 52 |
PStore | sum | 15975 | 26684 | 59356 | 19785 | 4887 | 50 |
ActiveRecord | sum | 27526 | 32525 | 89807 | 29935 | 2044 | 33 |
RestClient | sum | 122103 | 122781 | 367042 | 122347 | 307 | 8 |
Copyright (c) 2012 - 2013 ProPublica

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Daybreak is a project of ProPublica.