Why We Chose Riak


I have been playing around with Riak for the past month and came to the conclusion that Riak is a good option for our project, for the following reasons:

Ever-Evolving Object Model
The highly adaptive nature of our object model is not a good fit for a traditional ORM on top of an RDBMS, as the object model is highly customizable from customer to customer and may evolve from version to version. A traditional ORM would require the RDBMS schema to continuously keep up with our ever-evolving object model, requiring enormous effort from Engineering, Testing, and Operations.

The platform really does not care about the customized and highly evolved properties of object types. In other words, the platform only needs to know a pre-defined set of object properties for persistence and relationship resolution purposes, and does not need to know any of the other properties.
Riak, on the other hand, gives us the flexibility to store opaque objects. We decided to store objects as JSON rather than as serialized Java objects or XML, because JSON serialization is much more flexible and compact and needs far less storage than Java serialization or XML.
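
To make the idea concrete, here is a minimal sketch of storing an object as one opaque JSON document while the platform only looks at a few known properties. The field names (id, type, links) and the helper functions are hypothetical, chosen for illustration; they are not our real schema and not part of any Riak client.

```python
import json

# Hypothetical set of properties the platform itself needs to understand.
PLATFORM_FIELDS = {"id", "type", "links"}


def to_stored_json(obj):
    """Serialize the whole object as one compact JSON document."""
    missing = PLATFORM_FIELDS - obj.keys()
    if missing:
        raise ValueError(f"missing platform fields: {sorted(missing)}")
    # Customer-specific properties ride along untouched.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def platform_view(stored):
    """Read back only the properties the platform cares about."""
    obj = json.loads(stored)
    return {name: obj[name] for name in PLATFORM_FIELDS}


doc = to_stored_json(
    {"id": "srv-1", "type": "server", "links": ["rack-7"], "customerColor": "blue"}
)
print(platform_view(doc))
```

A new customer-specific property only changes the JSON payload; nothing resembling a schema migration is involved.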

High Availability and Multi-Data Center Support
Riak is built as a distributed data store, with tunable read and write replication settings.
Riak Enterprise offers multi-data center replication.
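
The tuning is usually described in terms of N (replicas kept per object), R (replicas that must answer a read) and W (replicas that must acknowledge a write). The sketch below only illustrates the arithmetic behind picking those values; the function name is made up for this post and it does not call Riak.

```python
def quorums_overlap(n, r, w):
    """Return True when every read quorum shares a replica with every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    # If r + w > n, a read set and a write set cannot be disjoint.
    return r + w > n


print(quorums_overlap(3, 2, 2))  # overlapping quorums
print(quorums_overlap(3, 1, 1))  # faster, but a read may miss the latest write
```

Lower R and W values trade consistency for latency and availability, which is the knob we want to be able to turn per use case.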

Free Text Search 
Riak comes with built-in free text search support, built on top of Lucene.

Adjacency Link Walking
Our object model relies on adjacency links between objects, and it is critical to be able to follow the object graph through these links. Riak offers MapReduce-based link walking, so we can easily retrieve all objects that are linked to a particular object through any number of link levels.

Secondary Index Support
Like an RDBMS, Riak offers secondary index support in addition to primary key lookup.

Multi-Tenant Support
Our platform must support multi-tenancy for security, partitioning and performance reasons, which is not trivial to accomplish in an RDBMS environment.

Riak, on the other hand, partitions data naturally into buckets, and buckets are distributed across different nodes. Tenants can be mapped to buckets, and data-level security can be accomplished by securing access to buckets. If we store all of a tenant's data in the same bucket, a user can only access that data if they have access to the bucket, and cannot access any object that does not belong to one of their accessible buckets.
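
A rough sketch of the tenant-to-bucket idea follows. It assumes the access check lives in our own application layer and that each tenant gets one bucket named after it; the class, the naming scheme and the user names are all hypothetical.

```python
class TenantAccess:
    """Tracks which users may read which tenant buckets."""

    def __init__(self):
        self._grants = {}  # user name -> set of bucket names

    @staticmethod
    def bucket_for(tenant_id):
        # Hypothetical naming scheme: one bucket per tenant.
        return f"tenant-{tenant_id}"

    def grant(self, user, tenant_id):
        self._grants.setdefault(user, set()).add(self.bucket_for(tenant_id))

    def can_read(self, user, bucket):
        # No grant on the bucket means no access to any object in it.
        return bucket in self._grants.get(user, set())


access = TenantAccess()
access.grant("alice", "acme")
print(access.can_read("alice", "tenant-acme"))
print(access.can_read("alice", "tenant-globex"))
```

Because every object of a tenant sits in one bucket, a single bucket-level check covers all of that tenant's data.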

Ad Hoc Query Support Through MapReduce
Riak gives us the ability to run ad hoc queries over the entire data set through a series of Map and Reduce phases. The only limitation is that MapReduce is executed in memory and must complete within a timeout limit. This is not a major concern given the size of our data set.
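
For readers new to the model, the following is a plain in-memory illustration of a map phase followed by a reduce phase, counting stored objects by type. It is only a sketch of the technique: the function names and sample objects are made up, it does not talk to Riak, and it does not model the timeout.

```python
from collections import Counter


def map_type(obj):
    """Map phase: emit one (type, 1) pair per stored object."""
    return [(obj["type"], 1)]


def reduce_counts(pairs):
    """Reduce phase: fold the emitted pairs into a total per type."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def run_phases(objects, map_fn, reduce_fn):
    """Run the map function over every object, then reduce everything emitted."""
    emitted = []
    for obj in objects:
        emitted.extend(map_fn(obj))
    return reduce_fn(emitted)


objects = [{"type": "server"}, {"type": "rack"}, {"type": "server"}]
counts = run_phases(objects, map_type, reduce_counts)
print(counts)
```

Everything emitted by the map phase is held in memory before the reduce runs, which is the same reason the real thing is bounded by memory and a timeout.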

Performance
Riak is based on a masterless distributed model, which should perform better than a master-slave model because no single node has to handle every write.

Operation and Monitoring Support
Riak ships with a UI monitoring tool and a set of commands for other administrative tasks such as backup and restore.

Concerns about Riak
We do have concerns regarding Riak from a business perspective. Even though Riak is an open source solution, its commercial backer Basho is still relatively young, and the user community is not as big as those of Hadoop, Cassandra, or MongoDB.

To mitigate the risk, we built a persistence abstraction layer that allows us to swap Riak for a different NoSQL technology in the future if necessary.
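
The shape of such a layer can be sketched as below. This is an illustration only, not our actual code: the interface name, its three methods and the in-memory backend are assumptions made for the example. A Riak-backed class would implement the same interface.

```python
from abc import ABC, abstractmethod
from typing import Optional


class ObjectStore(ABC):
    """What the platform needs from any key/value backend."""

    @abstractmethod
    def put(self, bucket: str, key: str, value: str) -> None: ...

    @abstractmethod
    def get(self, bucket: str, key: str) -> Optional[str]: ...

    @abstractmethod
    def delete(self, bucket: str, key: str) -> None: ...


class InMemoryStore(ObjectStore):
    """Stand-in backend, useful for tests; a Riak backend would sit beside it."""

    def __init__(self):
        self._data = {}

    def put(self, bucket, key, value):
        self._data[(bucket, key)] = value

    def get(self, bucket, key):
        return self._data.get((bucket, key))

    def delete(self, bucket, key):
        self._data.pop((bucket, key), None)


store: ObjectStore = InMemoryStore()
store.put("tenant-acme", "srv-1", '{"type":"server"}')
print(store.get("tenant-acme", "srv-1"))
```

Code written against ObjectStore never imports a vendor client directly, so changing the backend means writing one new class.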

All Hail Selenium Grid 2

NOTE: this post is based on the selenium-server-standalone-2.16.1 release.

So we finally have Grid 2. It is a complete rewrite of Selenium Grid and brings us the following features:
  1. Support for Selenium-Webdriver (Selenium 2)
  2. Full backwards compatibility with Selenium-RC (Selenium 1)
  3. Significant optimisations in both efficiency and functionality. The headline is that you no longer need a 1:1 mapping between RC instances and browsers, which meant huge memory consumption. Now a single node server can support all browsers for that node.
If you have never heard of Selenium Grid, I am talking about the ability to run distributed, parallel Selenium tests across multiple machines, physical or virtual, plus all the administrative functionality you would expect to need to support that.
I have only just started looking at Grid 2, so this is going to be a very brief overview. Hopefully, as I explore its functionality further, I’ll do some more involved posts.


Pretty much everything you need to get started is on the Wiki page but below are a few things that might be useful to clarify in order to get the ‘out of the box’ behaviour working.

Start a hub server:
java -jar selenium-server-standalone-2.16.1.jar -role hub -firefoxProfileTemplate /home/slaouini/tools/selenium-server/web

Start a node server:
java -jar selenium-server-standalone-2.16.1.jar -role node -hub http://127.0.0.1:4444/grid/register -nodeConfig myconfig.json -firefoxProfileTemplate /home/slaouini/tools/selenium-server/web

By default a node starts with a default set of browsers (5 Firefox, 5 Chrome, 1 IE). You can change this by supplying command line options when you start the node server; see the Selenium Wiki for more details.
Alternatively, you can supply a config file in JSON format, e.g.

Our JSON config file:
{
    "class":"org.openqa.grid.common.RegistrationRequest",
    "capabilities":
    [
        {
        "seleniumProtocol":"Selenium",
        "browserName":"*firefox",
        "maxInstances":20
        },
        {
        "seleniumProtocol":"Selenium",
        "browserName":"*googlechrome",
        "maxInstances":20
        },
        {
        "seleniumProtocol":"Selenium",
        "browserName":"*iexplore",
        "maxInstances":10
        },
        {
        "seleniumProtocol":"WebDriver",
        "browserName":"firefox",
        "maxInstances":20
        },
        {
        "seleniumProtocol":"WebDriver",
        "browserName":"chrome",
        "maxInstances":20
        },
        {
        "seleniumProtocol":"WebDriver",
        "browserName":"internet explorer",
        "maxInstances":10
        }
    ],
    "configuration":
    {
        "port":5555,
        "host":"10.222.9.127",
        "hubHost":"127.0.0.1",
        "registerCycle":5000,
        "hub":"http://127.0.0.1:4444/grid/register",
        "url":"http://10.222.9.127:5555",
        "remoteHost":"http://10.222.9.127:5555",
        "register":true,
        "proxy":"org.openqa.grid.selenium.proxy.DefaultRemoteProxy",
        "firefoxProfileTemplate":"/home/slaouini/tools/selenium-server/web",
        "maxSession":20,
        "role":"node",
        "hubPort":4444
    }
}
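
Since it is easy to lose track of the numbers in a file like this, here is a small sketch that summarises a node config: how many browser slots it declares in total, and how that compares with maxSession, which caps the number of sessions the node runs at once. The function name and the trimmed sample are mine; only the JSON keys come from the config above.

```python
import json


def summarize_node_config(text):
    """Return total declared slots, maxSession, and entries exceeding maxSession."""
    config = json.loads(text)
    capabilities = config["capabilities"]
    max_session = config["configuration"]["maxSession"]
    return {
        "slots": sum(cap["maxInstances"] for cap in capabilities),
        "max_session": max_session,
        # A browser asking for more instances than maxSession can never reach them.
        "over_cap": [
            cap["browserName"]
            for cap in capabilities
            if cap["maxInstances"] > max_session
        ],
    }


# Trimmed sample using the same keys as the config file above.
sample = """
{
  "capabilities": [
    {"seleniumProtocol": "WebDriver", "browserName": "firefox", "maxInstances": 20},
    {"seleniumProtocol": "WebDriver", "browserName": "internet explorer", "maxInstances": 10}
  ],
  "configuration": {"maxSession": 20}
}
"""
summary = summarize_node_config(sample)
print(summary)
```

Run against the full file, this makes it obvious that the declared slots add up to far more than maxSession allows at any one time.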

Parallelism on the Client

If you are using TestNG to drive your tests, this is most easily done using the TestNG XML test suite file.
Essentially, it allows you to run tests in parallel across cores and/or threads. The choice will be determined by your execution environment: if you are running in a CI queue, your job may only have a single core, so you are better off with multiple threads. And when all the client does is feed commands to a remote server that runs the browsers, multiple threads make more sense anyway.

Our TestNG file:


<suite name="SeleniumTests" parallel="tests" thread-count="10" verbose="10">
<test name="TEST01" >
<parameter name="webSite" value="http://10.222.4.109:8080/admin-console/"/>
<parameter name="seleniumHost" value="127.0.0.1"/>
<parameter name="seleniumPort" value="4444"/>
<parameter name="browser" value="*firefox"/>
<parameter name="timeout" value="10000"/>
<classes>
<class name="com.test.selenium.TEST01" />
</classes>
</test>
....
<test name="TESTN" >
<parameter name="webSite" value="http://10.222.4.109:8080/admin-console/"/>
<parameter name="seleniumHost" value="127.0.0.1"/>
<parameter name="seleniumPort" value="4444"/>
<parameter name="browser" value="*chrome"/>
<parameter name="timeout" value="10000"/>
<classes>
<class name="com.test.selenium.TESTN" />
</classes>
</test>
</suite>


Conclusions
So there you have it… I have barely scratched the surface of what I believe Grid 2 can offer. From listening to the presentation by the authors, there is much more to discover.

That's all folks, take care :)

Git Cheatsheet


Drop that Sh*t ! Use Hudson

Holla folks
I have been involved in the development of a private cloud platform for a couple of months now. Since then I have implemented things that work fine, and I am still pushing ahead toward that goal, so I’ve decided to write this up for everyone :)

First, I have the burden of explaining why and how I use Hudson in my project.
While it is important to get your build server building your software, it is even more important to get your build server to let people know when it can't do so.

In all cases, I needed a server that lets the right people know about any new issues, and does so fast. This is what we call notification.
I also needed a cron-like system, but to be honest, I don’t think cron is even a “good enough” solution for most of today’s systems.

So I decided to use this wonderful tool called "Hudson" to provide some of the features I needed for my project. My experience with Hudson made me fall in love with it. Here’s why:

Among the myriad ways Hudson can measure success of a “build,” it can verify a zero return status from each “execute shell” build step.
If a job simply returns anything but zero, Hudson considers the build a failure and can notify you however you like.
It can email you (on first failure only or every time), you can subscribe to build feeds via RSS, or you can simply use the Hudson interface as a dashboard that shows failures in a convenient, summarized way.
Hudson logs the output of “execute shell” build steps.
Success or failure, Hudson archives the build output without filling your inbox or local disk.
If the console output isn’t enough, Hudson can archive per-run “build artifacts,” which are files on disk matching a defined pattern.
There’s also no-hassle “log rotation” by specifying a cap on the number of builds or a set number of days to keep results; this is configurable per-job.
If a particular run had output (say, for troubleshooting) you want to keep around, you can tell Hudson to “keep this build” indefinitely.
Hudson runs each build on “build executors,” which are effectively process slots.
Any system can have any number, but it puts a cap on how much Hudson tries to do, systemwide.
This means 50 jobs can get scheduled to run every hour with four “build executors,” and Hudson will queue them all every hour and run four at once until they’ve all finished.
If a job is still running when the “periodic build” time comes around, Hudson can either run the job immediately (like cron) or queue the job to run when the one in progress finishes.
Hudson isn’t limited to time-based scheduling.
Sometimes, it’s useful to take a job that used to run periodically (say, a database refresh) and make it only available for manual kickoff.
Of course, as a CI tool, Hudson can kick off jobs based on polling a version-control system.
For remote jobs, Hudson can sign onto systems with SSH, copy over its own runtime, and run whatever you’d like on the remote system.
This means that, no matter how many servers in a cluster need scheduled jobs, Hudson can schedule, run, and log them from one server.
Hudson can distribute the jobs dynamically based on which machines are already busy, or it can bind jobs to specific boxes.
Hudson has a solid web interface that can integrate with your Unix shadow file, LDAP, or other authentication methods.
For people who prefer operating from the command line, Hudson has a CLI.
Every job’s running time is logged. Hudson even provides estimates for how long it will take the system to get to any particular job when there’s a queue.
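
The build executor idea from the list above (50 queued jobs, four slots) can be imitated with a fixed-size thread pool. This is only an analogy written for this post, not how Hudson is implemented; the job body is a short sleep standing in for a real build step.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_jobs(job_count, executors):
    """Queue job_count jobs on a pool with a fixed number of executor slots."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def job(number):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.001)  # stand-in for the real build step
        with lock:
            running -= 1
        return number

    # Jobs beyond the slot count wait in the pool's queue until a slot frees up.
    with ThreadPoolExecutor(max_workers=executors) as pool:
        finished = list(pool.map(job, range(job_count)))
    return len(finished), peak


finished, peak = run_jobs(50, 4)
print(f"{finished} jobs finished, at most {peak} running at once")
```

However many jobs are queued, the number running at the same time never exceeds the number of slots, which is exactly the cap described above.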

Voilà. The next post will be more precise: I will share the architecture of the project as I implemented it, and show exactly how Hudson is used and how our private cloud is built using other great tools and frameworks ...

Peace out.

So many tools, So little time !

There are many choices to be made when starting a new project. I was wondering what to choose for developing a web application needed to generate some artifacts. In the end I was choosing between:

Ruby + Rails
Python + Django
Scala + Lift
Java + Play

Despite the success of Rails in the past few years, it never really clicked with me. Granted, I haven’t used it for more than a few weeks, but it never really felt natural. Some of my friends who are Ruby developers tried to convince me, but with no success!
Same with Django. I’ve used Python on several occasions for scripting tasks, since I use it for automation work in WebSphere Application Server, but never really for webapps. I think Python is a great and powerful language, but this time I wanted to try something new.
I'm curious, and I wanted something new to satisfy my curiosity and keep my brain busy for a while. Sorry Java, no Play for now :p

So I ended up with Scala and Lift. I looked briefly at Scala in 2007 and liked what I saw, but didn't use it for anything serious since I was in some fucking trouble at that time. Anyway, in the end, what made the difference was:
A well-known development and deployment platform; Java interop, both for using existing libraries and as a fallback plan if everything failed; and Lift’s clean templating. We didn’t really need any of the advanced features like Comet support in Lift, although the Ajax support looked nice.

So here we are. How has it been?
In short: not bad. I have a platform up and running :) Compared with Java code, my Scala/Lift code is not very verbose. I will not go through the pros and cons of my little experience with Scala and Lift, as that's not the aim of this blog post, but I'm really happy I tried it out and really encourage you to do so :)
Overall, I liked the Lift framework and how it utilizes the Scala language. The fundamental approach to request handling seems very well thought out and makes it easy to handle traditional web apps, Ajax, REST APIs, etc.
The focus of Lift seems to be on getting things done, not so much on creating the perfect web framework abstraction that has all corner cases covered. This means most code has been battle tested, but sometimes you’ll wander along and strange things will happen.

Here are some links to start programming in Scala like a beast :p

Important Travel Rules and Tips


Hey dudes :)
Here are some rules for travel that everyone should know and I wish I had known when I started:

1- Learn Some Local Phrases- You don’t need to master the language, but learning a few phrases will show some interest, bring a smile to a local’s face, and get you a much friendlier response.

2- Sometimes it can be insulting if you leave money.

3- Don’t Claim to be an Expert- No matter how many times you’ve been to Paris, unless you’ve lived there for a long time, you are not an expert. You just know more than others.


4- Travel Alone- Traveling alone is something everyone should do once. You’ll learn more about yourself than at any other time in your life.

5- If not alone, travel with your friends, only the crazy ones. Girls are not welcome, they are boring.

6- Act like you know what you are doing and where you are going at all times, even if you don't.

7- Take Cash- In many parts of the world, credit cards are not accepted everywhere. Don’t get too tied to the plastic.

8- Always Visit Tourist Information Centers- These offices know all the information in the city, know what is going on, and usually have some discounts available. Don’t skip them.

9- Don’t Live by Your Guidebook- Take the information in guidebooks with a grain of salt.

10- Locals are Happy to Help- Don’t be afraid to ask strangers for help.

11- Learn about the places to avoid (from other travelers or friendly locals). Don't end up in the wrong place.

12- Respect has great importance. Educate yourself on the local dress codes, religion and customs. You may find yourself either offending the locals...or really turning them on. Watch out.

13- Don't be a Stupid fool !! (The most important one)

Remembering Gaza



A year ago to this date, a brutal Israeli offensive on Gaza took place. This tiny strip of land had been besieged for many months already and its people were already denied the very basic amenities of life. Yet, a year ago Israel launched a brutal and bloody attack on Gaza that killed at least 1400 people. This savage attack did not distinguish between civilians and militants, and people generally agree that the Israeli response to the termination of the cease fire was at least disproportionate.

The Israeli-given reason for this attack was to fend off the Hamas make-shift rockets that were fired onto southern Israel, which have escalated after the end of the cease-fire. However, this is hardly the case. The Hamas rockets are very primitive and make-shift. They are literally composed of gun powder in pipes that fly. Thus there was no infrastructure for Israel to take out. Plus, the Israeli attack did not stop the Hamas mortars even after it finished.

Unfortunately, it seemed that this attack was little more than a political move to help a candidate win an election. At the start of the offensive, the ruling Israeli party (Kadima), headed by Tzipi Livni, was losing in the primary polls against pro-war hardliners and right-wingers such as Benjamin Netanyahu. This full-scale attack (without the interference of George Bush in his last days in office) boosted Livni’s poll numbers and made the elections closer. Unfortunately, besides the large Palestinian death toll, this was the only outcome of this campaign on the Israeli side.

The Israeli attack didn’t distinguish between militants and civilians, and very little was done to minimize civilian casualties. Israel continued to justify its mass bombing by saying that Hamas militants were hiding behind women and children. However, it seemed like Israel was using that as a ready excuse to justify anything (and little proof of that was provided in many cases). For example, Israel bombed a U.N. school that was temporarily housing refugee families, killing at least 40 people, all of whom were civilians. Israel also ignored international conventions and used weapons that are banned internationally from use against civilians. For example, white phosphorus was used in civilian locations, which is banned internationally. To make matters worse, Israel prevented international news reporters from entering Gaza so as to limit the amount of information getting out.

Much has been said about this offensive. A U.N. fact-finding investigation of this conflict (what is sometimes called the “Goldstone Report”) concluded that both Hamas and Israel were to blame for the conflict, and concluded that Israel committed war crimes and possible crimes against humanity.

Regardless of your political views, you have to acknowledge that what Israel did in killing more than 1400 people, at least 1000 of whom were unarmed civilians and at least 300 of whom were children, was extremely wrong and inhumane. Today, Israel is still preventing basic amenities and even cement from entering Gaza to rebuild the destroyed homes.

On this day, it is worth giving a moment to think and reflect.