Get Rid of Duplicate Contacts in Marketo

We see this often with clients using Marketo — while it is a great marketing automation solution, chances are that you have a lot of duplicate contact records in your Marketo customer data set.

Here is what happens: you load an initial customer list into Marketo, and over time, as you add new customer records to that list, Marketo may fail to check for duplicate email addresses. Instead of records getting updated or merged, you get multiple records with the same email address.

And this is costly. If you are paying by the number of contacts managed within Marketo, you pay multiple times for the same person, and you also incur the wrath of customers who receive the same campaign email from you several times.

How do you fix this? Hire us, of course :)

But seriously, we have done this enough times for customers, and now we are contemplating providing this as an automated service for Marketo customers. Our simple fix process is as follows:

  • Working with our partner company Empathy Logic, we maintain a transient customer data store for data consolidation and cleansing — which includes de-duplication and householding
  • We have connectors built to Marketo’s platform so upon your security authorization, we can pull in all your customer data from Marketo to our platform and run our de-duplication routines to flag all the duplicate records.
  • We then send the list of fixed records to the Marketo platform so all the duplicates are removed from your customer data set

The time to apply this fix depends on the size of your customer data set and the speed of Marketo’s record update process. We find that it is faster to update the entire customer data set than to make batch fixes.
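To make the flagging step concrete, here is a minimal sketch of email-based de-duplication. It is not our production routine and it makes no Marketo API calls: it assumes the records have already been pulled into plain dictionaries, and the field names (`id`, `email`, `phone`, `updated_at`) and the "keep the most recently updated record" rule are assumptions made for illustration. Householding is not covered.

```python
from collections import defaultdict


def normalize_email(email):
    """Lower-case and trim an address so trivial variants compare equal."""
    return email.strip().lower()


def find_duplicates(records):
    """Group records by normalized email; keep only groups with more than one record."""
    groups = defaultdict(list)
    for record in records:
        email = record.get("email")
        if email:  # records without an email cannot be matched this way
            groups[normalize_email(email)].append(record)
    return {email: recs for email, recs in groups.items() if len(recs) > 1}


def merge_group(recs):
    """Keep the most recently updated record and fill its blank fields from older ones."""
    # ISO-formatted dates (YYYY-MM-DD) sort correctly as strings.
    ordered = sorted(recs, key=lambda r: r["updated_at"], reverse=True)
    survivor = dict(ordered[0])
    for older in ordered[1:]:
        for field, value in older.items():
            if survivor.get(field) in (None, ""):
                survivor[field] = value
    duplicate_ids = [r["id"] for r in ordered[1:]]
    return survivor, duplicate_ids


# Hypothetical sample data, for illustration only.
contacts = [
    {"id": 1, "email": "Pat@Example.com", "phone": "", "updated_at": "2015-03-01"},
    {"id": 2, "email": "pat@example.com ", "phone": "555-0100", "updated_at": "2015-01-15"},
    {"id": 3, "email": "lee@example.com", "phone": "555-0101", "updated_at": "2015-02-10"},
]

duplicates = find_duplicates(contacts)
merged = {email: merge_group(recs) for email, recs in duplicates.items()}
```

Each entry in `merged` pairs the surviving record with the ids of the records it replaces, which is the kind of "fixed records" list that would then be sent back to the marketing platform.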

We are also curious to learn if this has happened to you, and what solution approaches you have found to be effective. We would love to hear from Marketo folks as well about best practices for avoiding duplicates in the first place.

NuTreeSource by Team Almond Brothers: 3rd Place award @ #AppsForAg Hackathon

“Hey guys, I’m not a techie, but I keep track of all of my farm visits and applications of fertilizers and pesticides using OneNote .. let me show you on my iPhone ..” With that, Justin Dutra had us hooked on working with him at the first ever AppsForAg AgTech hackathon (in the world?), hosted at West Hills Community College in Coalinga, CA.

Justin is what they called a “grower” at this event. His life is immersed in agriculture. He advises hundreds of farm owners about managing nutrients and fertilizer, about pesticides, about irrigation. Each hackathon team needed to have a grower. We were lucky to have Justin (actually two Justins: Mr Dutra the “grower” and Mr Bowen the “hacker extraordinaire”).


He also managed scope creep: “of course we could do a lot of things – but let’s focus on 1 crop (Almond), and 1 nutrient (Nitrogen, the most important, and also most difficult to manage well)”

And we gave our team a name — The Almond Brothers!

Our app NuTreeSource would do smart budgeting and tracking of how much nitrogen to apply to almond crops to maximize yield (and reduce groundwater consumption; every bit counts in these days of drought in California).

And we would do it all within a span of 24 hours. This is what it looked like:




Team Almond Brothers hard at work



Justin Dutra (Grower) and Sandeep Giri (representing Developer) posing in front of their masterpiece – whiteboard specs and mockups for NuTreeSource app

And on Sunday at 11:30, we submitted our app and our presentation. Six other teams presented along with us. They were all great, and we learned so much from each one of them; we were almost in awe of how well everyone knew the space and of their innovative approaches to each problem.

We are happy to report that we received the third place award. And more importantly, we received valuable feedback on how to improve our app, which we certainly intend to do.

Team Almond Brothers receiving their 3rd place award for NuTreeSource

Above all, what a great experience – to go in on a Friday evening (after 4.5 hours of driving through heavy traffic from San Francisco), and by Sunday, come up with not only application ideas vetted by a (committed) user group of growers, but also a working prototype that we can improve upon.

The smiles in this group picture tell it all. Thanks everyone!



Meet Us at Apps for Ag Hackathon

Lately we have been fascinated with the potential of the Internet of Things (IoT) in the agriculture sector. So, we couldn’t really stay away from the Apps for Ag Hackathon in Coalinga, CA (apparently the Harris Ranch, the place where all attendees are staying, has the best steaks on the planet, but I digress :)


Anyway, the suggested projects are pretty fun –

Come by and meet our team (Tom, Justin, and me), and if you want to join our hackathon team, even better. We will keep you posted. Cheers!

OpenI 3.0.1 is here (Pentaho Plugin for OLAP Data Visualization)

Dear OpenI Community,

We are happy to announce the release of OpenI 3.0.1 as a plug-in for Pentaho CE. With this release, we will focus more on developing new features on top of Pentaho (and perhaps Jasper) instead of OpenI being a BI Server in itself.

The OpenI plugin for Pentaho provides a simple and clean user interface to visualize data in OLAP cubes. It supports both direct Mondrian connections and XMLA-based connections to servers such as Microsoft SQL Server Analysis Services (SSAS), and provides OpenI-specific features like:

  • Explore OLAP cube data – point to a cube, choose a metric, and instantly chart the metric against all available dimensions
  • Write custom SQL for drillthrough to return more detailed data sets (instead of just identifiers)
  • Post drillthrough results directly to an external web service (instead of always having to download to local machine)
  • and many more..

A demo video is available at –

You can download it at –

Documentation is at –

We would love to hear your feedback.


Sandeep Giri
Project Lead, OpenI.Org

Replace Jaspersoft OLAP UI with OpenI Plug-In

I am happy to announce some fun and exciting changes at OpenI, and the alpha release of our team’s work over the last couple of months.

First, OpenI as a “BI platform” on its own will be discontinued. Instead, we will use Pentaho and Jasper as our base platforms as needed. The approach is to release “OpenI plug-ins” for Jasper and Pentaho, so that they behave and appear in a manner very similar to OpenI.

Our first step in this direction is our “Plug-In” for Jasper that replaces its JasperSoft OLAP (aka JasperAnalysis) UI, which is basically JPivot as-is, with OpenI’s UI. We believe this is a greatly improved UI for JasperServer users who need a better interface for OLAP reporting and exploratory analysis. Here is a screenshot:

OpenI Plug-In Replaces JasperAnalysis UI


You can also check out a live demo here: (Use the demo user account “user/user” or “demo/demo” to get logged into the system)

Currently this is an alpha release. You can download binaries or source code here –

Let us know what you think. If you need help, etc. – use the support forum.

The general thinking is: if you deploy Pentaho or Jasper with the OpenI plug-in, you will get the lightness and easy-to-use look-and-feel, plus features that are unique to OpenI and not available in Jasper or Pentaho (such as exploring cube data). We may also have to create our own installers that make the installation/deployment process easier.

So – that’s the general direction.

Although Jasper was an easy pick for this first release, we are looking for similar approaches with Pentaho. Pentaho already has efforts underway to replace the JPivot UI, in two different ways: their open source version has the Pentaho Analysis Tool (going through a rewrite at the moment), and their proprietary version has Pentaho Analyzer, which is pretty decent. So we are not sure how much value it would add to offer OpenI’s UI as a Pentaho plug-in, but there might be other features that are better suited for an OpenI plug-in for Pentaho.

Pentaho is pretty impressive because they take a platform approach, not just a reporting server. So even though each individual component may not be as fully developed as it needs to be, architecturally I think they have a sound approach. The work we will do with Pentaho will be more along the lines of how we can make it easier to use. Whether that will be via plug-ins or by embedding Pentaho in our own build, we will try it out and see what makes sense.

Stay tuned, and of course, your feedback is much appreciated, as always.


In Commercial Open Source, Partners = Community

What wowed me most at the Pentaho Global Partner Summit (#PGPS on Twitter) was, well.. the partners, or the partner program, I should say.

Back in the day, when we submitted our open source projects to sourceforge – we would sit there checking the number of downloads almost every minute, obsessing over the daily downloads and sourceforge ranking. A lot of this was driven by (besides the desire to be famous and get geek cred) the belief that more downloads = larger community = more “contribution” from the community.

This did not turn out to be true.

And not just for us, but for most open source enterprise applications out there, number of downloads has nothing to do with community participation. For that, you have to go beyond the realms of sourceforge forums and tracker – and actually actively build a community.

How do you do that? Well, after spending 2 days at PGPS, I am deeply impressed by how great a job Pentaho has done in building a thriving community via their partnership program. What we expected back in the day from our downloaders is exactly what Pentaho is getting from their partners. Partners are writing new features (e.g. Community Dashboard Framework, integration with CMS, single sign-on, etc.), they are fixing major bugs, they are writing books, they are even actively participating in shaping the roadmap. Simply amazing!

Whether this is a phenomenon unique to “commercial open source” – we don’t know, but look – for almost every enterprise open source project, at some point, reality kicks in, and we have to worry about monetization – so many of us become “commercial”. Of course, that immediately brings a tension between the users of the “free” version versus the “premium” version – as in, “why did you put feature ABC only in premium, and not in the free one, you greedy capitalistic pig?”

Well, one answer: “free” doesn’t pay the bills, “premium” does. “Free” would also probably be justifiable if there were some contribution, but as we have seen, most people download and use open source for free and don’t contribute anything. Over time, this becomes unbearably taxing for the core developers of the open source project.

But amazingly, you can get more contributors to your “premium” version (and to the “free” version as well, by extension) if you build a great partnership program around it like Pentaho has done. This is because the “premium” version not only pays Pentaho’s rent, it also helps their partners pay theirs.

So yes, this isn’t the good old open source where it was all about freedom, peace, and love. This one is definitely about the money, but the twist is that it does share the wealth AND the open source bit makes it much easier for partners to participate and contribute. And in doing so, it brings the extremely sought-after “community contribution” back into the game, which is the life/death factor for any open source project.

So Pentaho, hats off to you guys for showing how to build a thriving community around commercial open source via a great partner program. Don’t ever go to the dark side :)

Forrester Wave Categorizes OpenI as “Reporting Tool” – You Agree?

“The Forrester Wave™: Open Source Business Intelligence (BI), Q3 2010” report by Boris Evelson came out today (Aug 10, 2010), categorizing OpenI as a “Reporting Tool”.

While we feel honored that Forrester took notice (last year they had pretty much called us “dead”), it is interesting to see that they see the Open Source BI land as fragmented into different specialty components: “data integration” tools (which we’re guessing is the same as ETL), “data reporting” tools, “advanced analytics” tools, and “geospatial analytics” tools. The only open source projects that qualify as a comprehensive “BI Suite” are BEE, Jaspersoft, Pentaho, and SpagoBI.

And then further on, they say that — according to “Forrester’s 157-criteria evaluation of open source BI vendors, we found that Actuate BIRT led the pack because of richness of reporting functionality. Jaspersoft Enterprise, SpagoBI, Pentaho Enterprise, and Pentaho Community are close behind and also offer much fuller and broader BI stack than Actuate BIRT, including extract, transform, and load (ETL) and advanced analytics functionality.”

We’d love to see what their 157 criteria are, but the full report costs US$1,749. This is probably one of the few scenarios where a report on enterprise software costs more than most of the individual software licenses, but that’s open source for you.

So, what does it all mean? Well, for open source projects like OpenI, the Forrester Wave is a good place to get mentioned, because it will attract some new people to at least look at our software. We’d beg to differ and state that we are more than just a “reporting tool”, but at the end of the day, that’s mostly just semantics. It probably benefits Actuate BIRT the most, since it gets rave reviews over Pentaho and JasperSoft because of “richness of reporting functionality” (eye candy?), but Pentaho and JasperSoft can take a bit of comfort in being described as having a “much fuller and broader BI stack than Actuate BIRT”.

However, a casual customer who is looking for a decent open source BI solution couldn’t care less about all this, because what they’d like to know is who can meet their requirements with the least effort/cost and the highest reliability. Perhaps reports like these should also consider factors such as ease of adoption, TCO, support, license friendliness, and whether any vertical solution packs are offered on the open source stack, paid or not. One of the key strengths of any open source project is the community behind it, and the type of ecosystem it has been able to create, where people in the community are building new solutions (plug-ins, extension components, etc.) and are really involved in advancing the platform, as opposed to having all new development come from the open source company’s internal development team.

Ultimately, it’s not just the Forrester report’s responsibility. The onus is also on open source projects/companies to make this transparent so that newcomers more or less know what they are getting into before investing a lot of time/energy. Otherwise we run the danger of having so much “markitechture” on our home pages that IT organizations have no option other than to read the Forrester Wave to figure out their open source BI strategy.

OpenI Differentiators

We received some decent feedback in our discussion thread on OpenI’s future roadmap. Here’s one from “noblomov” that describes how OpenI is different from other open source BI tools and where we should focus next (we couldn’t have said it any better – so thanks!)

Hi Sandeep,
thanks for sharing with us what should be the future of Openi and giving us the opportunity to tell you and the Openi dev team what we like about Openi compared to other open source products out there, and what featuers we’d like to see in future versions.
For me there are various very interesting points about Openi compared to Pentaho :
  • Openi is pretty easy to install, where Pentaho isn’t as straightforward to my mind
  • Openi is very simple and “basic” : create reports and see them through Dashboards, where Pentaho is trying to do more things and as a result isn’t as easy to use for the end users
  • Openi offers a real BI SaaS platform, allowing several clients (different departments of the same enterprise, but also different companies) to connect to the same infrastructure, where Pentaho is a dedicated solution. This is for me the main advantage of Openi over other Open Source BI Solutions, and this is a big one.
The features I’d like to see on Openi 3.0 would mainly be :
  • allow finer control of users rights on a project. Today there are 3 users type : application admin, project admin, project user. It would be great to have optionnal settings on project admin for example, giving this profil the rights to create a limited number of accounts for its project. So an optional “accounts quota” setting would be nice.
  • As I see Openi as a great SaaS BI solution, it would be great to allow complete separation of different projects databases. Today to my knowledge the Projects in OpenI use different tables, but in the same database (same MySQL database for example). I would like to be able to define separate database for different projects, and then permit a total separation of projects datas (each project could have its own MySQL database). That would be a real plus in terms of scalability and security.
That’s my 2 cents for OpenI, which is a great BI tool.
By the way, is there any plan for a General Availability version for OpenI 2.0 ?

OpenI’s Future as a BI Platform vs a BI Application

A great question came up on OpenI forum from Andre, which I feel is important to share with all of you:

What new features that are planned for the Open? There is a forecast for the next version? What is the main advantage of the Openi on the Pentaho?




To which, my response is:

Hi Andre

Your message comes at an interesting and exciting time for us. As you saw, for most of 2009 we focused on tightening up the 2.0 release, which is now stable and has gotten good feedback. Now in 2010, we will continue with point releases on 2.0 with bug fixes and enhancements, and we’re also in the midst of planning the road map for OpenI 3.0 and beyond.

Basically the big question for us is: is OpenI a BI platform, or more of a BI application? OpenI started back in 2005, right around the same time Pentaho and JasperSoft launched. While Pentaho, Jaspersoft, et al have done a great job in building out a robust BI platform, OpenI’s differentiator is that it strives to be a BI application that a user can use right “out of the box”, as opposed to an “SDK” on top of which a BI developer will build their BI application. Hence a lot of our work has gone towards making the installation increasingly easier, being able to just point to an OLAP data source and start publishing analyses/dashboards without having to write code, supporting Microsoft Analysis Services, etc.

However, all this requires a BI platform underneath, and to date, OpenI has built its own platform using the same “usual suspect” components (JPivot, Mondrian, etc.) that most other open source BI projects use. And now we’re asking ourselves if that isn’t re-inventing the wheel. Why take on the development and maintenance of a BI platform (although using a lot of open source components) when you can probably use an existing open source BI platform and focus more on your differentiators?

So the most likely outcome for the 3.0 road map will be that we’ll use a comparable open source BI platform, where we can not only migrate all of the key features of OpenI 2.0 but also start focusing more on usability-related features. Sorry to be vague/high-level, but we will have a more elaborate design/roadmap published on our website soon that’ll describe these features and solicit your feedback.

Which means a big part of all this is where our community would like to see OpenI go. So, your feedback, feature requests, or just general design guidelines are very important to us as we plan the road map for 2010.

Thanks for the nudge on this very important issue, now we’ll have to work harder to publish our road map and clear up things for everyone :)


Project Lead, OpenI.Org