Search

Domino Upgrade

VersionSupport end
5.0
6.0
6.5
7.0
8.0
8.5
Upgrade to 9.x now!
(see the full Lotus lifcyle) To make your upgrade a success use the Upgrade Cheat Sheet.
Contemplating to replace Notes? You have to read this! (also available on Slideshare)

Languages

Other languages on request.

Twitter

Useful Tools

Get Firefox
Use OpenDNS
The support for Windows XP has come to an end . Time to consider an alternative to move on.

About Me

I am the "IBM Collaboration & Productivity Advisor" for IBM Asia Pacific. I'm based in Singapore.
Reach out to me via:
Follow notessensei on Twitter
(posts)
Skype
Sametime
IBM
Facebook
LinkedIn
XING
Amazon Store
Amazon Kindle
NotesSensei's Spreadshirt shop
profile for stwissel on Stack Exchange, a network of free, community-driven Q&A sites

03/10/2017

Excel in Continuous Integration

Category  

Business Users like Excel. Besides its original use case of calculating, lists are a favourite use case. They also serve as poor man's requirement and bug tracker, so they siep into software development too.

While Excel sheets are great for interaction, they are a beast for anything automation. The irony of it: Modern Excel files (the xslx flavour) are nothing less than zip files with XML content. However that format is, let's say, [insert expletives here].

From an XML representation I would expect something like <cell row="23" col="44">Some value</cell>. However that's not what Excel does. Rename an xlsx to zip and see for yourself. Also (which makes sense for Excel itself) empty cells are not represented in XML.

Cutting a long story short, an Excel file or its XML representation poses some challenges:

  • XML format is not very suitable for automation, like generating reports using XSLT
  • Excel automation only runs on Windows (and when you run headless, you trade the head for a headache). That makes it a no-go for most automation server environments
  • Empty cells are absent from the XML (a variation of "not suitable")
  • Cross reports with other files (e.g. logs in XML format) is hard

To overcome these limitations I wrote Excel2XML. It is a little Java command line utility that converts Excel into a more digestible XML format. I used Microsoft's contribution to the Apache POI project to read the file. It has the following functions:

  • Extract workbooks in one or separate files per worksheet
  • Ignore all formatting
  • Computed cells return their last result values, unless it is a formula error, then the formula is returned
  • The first line of each sheet is treated as column headers, which are extracted as columns/column elements
  • Each cell has a column, a row and a title attribute. The title reflects the value from the first row. This allows in XSLT to query the title instead of relying on the column number. Reordering, adding or removing columns won't kill your XSLT stylesheet that way
  • Optional empty cells can be generated with an attribute of empty="true"
  • Runs on Java8 completely from command line
  • Calling it without parameters outputs the exact syntax of options

The full syntax: java -jar excel2xml.jar -i somefile.xslt [-o somefile.xml [ ]-e] [-s] [-w3,4]

  • -i the input file in xslx format
  • -o the output file. If missing same name as input, but extension xml
  • -e generate empty cells. If missing: cells without data are skipped
  • -s generate a single file for the whole workbook. If missing: creates one file per sheet
  • -w comma separated list of sheets to export. Starts at 0. If missing: exports all sheets

Head over to the git repository and grab a release. Let me know what use you found. As usual YMMV

03/01/2017

Lessons from Project OrangeBox

Category   

Project OrangeBox, the Solr free search component, was launched after the experiments with Java8, Vert.x and RxJava in TPTSNBN concluded. With a certain promise we were working on a tight dead line and burned way more midnight oil than I would have wished for.

Anyway, I had the opportunity to work with great engineers and we shipped as promised. There are quite some lesson to be learned, here we go (in no specific order):

  • Co-locate
    The Verse team is spread over the globe: USA, Ireland, Belarus, China, Singapore and The Philippines. While this allows for 24x7 development, it also poses a substantial communications overhead. We made the largest jumps in both features and quality during and after co-location periods. So any sizable project needs to start and be interluded with co-location time. Team velocity will greatly benefit
  • No holy cows
    For VoP we slaughtered the "Verse is Solr" cow. That saved the Domino installed base a lot of investments in time and resources. Each project has its "holy cows": Interfaces, tool sets, "invaluable, immutable code", development pattern, processes. You have to be ready to challenge them by keeping a razor sharp focus on customer success. Watch out for Prima donnas (see next item)
  • No Prima Donnas
    As software engineers we are very prone to perceive our view of the world as the (only) correct one. After all we create some of it. In a team setting that's deadly. Self reflection and empathy are as critical to the success as technical skills and perseverance.
    Robert Sutton, one of my favourite Harvard authors, expresses that a little bolder.
    In short: A team can only be bigger than the sum of its members, when the individuals see themselves as members and are not hovering above it
  • Unit test are overrated
    I hear howling, read on. Like "A journey of a thousand miles begins with a single step" you can say: "Great software starts with a Unit Test". Begins, not: "Great software consists of Unit Tests". A great journey that only has steps ends tragically in death by starvation, thirst or evil events.
    Same applies to your test regime: You start with Unit tests, write code, pass it on to the next level of tests (module, integration, UI) etc. So unit tests are a "conditio sine qua non" in your test regime, but in no way sufficient
  • Test pyramid and good test data
    Starting with unit tests (we used JUnit and EasyMock), you move up to module tests. There, still written in JUnit, you check the correctness of higher combinations. Then you have API test for your REST API. Here we used Postman and its node.js integration Newman.
    Finally you need to test end-to-end including the UI. For that Selenium rules supreme. Why not e.g. PhantomJS? Selenium drives real browsers, so you can (automate) test against all rendering engines, which, as a fact of the matter, behave unsurprisingly different.
    One super critical insight: You need a good set of diverse test data, both expected and unexpected inputs in conjunction with the expected outputs. A good set of fringe data makes sure you catch challenges and border conditions early.
    Last not least: Have performance tests from the very beginning. We used both Rational Performance Tester (RPT) and Apache JMeter. RPT gives you a head start in creating tests, while JMeter's XML file based test cases were easier to share and manipulate. When you are short of test infrastructure (quite often the client running tests is the limiting factor) you can offload JMeter tests to Blazemeter or Flood.io
  • Measure, measure, measure
    You need to know where your code is spending its time in. We employed a number of tools to get good metrics. You want to look at averages, min, max and standard deviations of your calls. David even wrote a specific plugin to see the native calls (note open, design note open) or Java code would produce (This will result in future Java API improvements). The two main tools (besides watching the network tab in the browser) were New Relic with deep instrumentation into our Domino server's JVM and JAMon collecting live statistics (which you can query on the Domino console using show stats vop. Making measurements a default practise during code development makes your life much easier later on
  • No Work without ticket
    That might be the hardest part to implement. Any code item needs to be related to a ticket. For the search component we used Github Enterprise, pimped up with Zenhub.
    A very typical flow is: someone (analyst, scrum master, offering manager, project architect, etc.) "owns" the ticket system and tickets flow down. Sounds awfully like waterfall (and it is). Breaking free from this and turn to "the tickets are created by the developers and are the actual standup" greatly improves team velocity. This doesn't preclude creation of tickets by others, to fill a backlog or create and extend user stories. Look for the middle ground.
    We managed to get Github tickets to work with Eclipse which made it easy to create tickets on the fly. Once you are there you can visualize progress using Burn charts
  • Agile
    "Standup meeting every morning 9:30, no exception" - isn't agile. That's process strangling velocity. Spend some time to rediscover the heart of Agile and implement that.
    Typical traps to avoid:
    • use ticket (closings) as (sole) metric. It only discourages the us of the ticket system as ongoing documentation
    • insist on process over collaboration. A "standup meeting" could be just a Slack channel for most of the time. No need to spend time every day in a call or meeting, especially when the team is large
    • Code is final - it's not. Refactoring is part of the package - including refactoring the various tests
    • Isolate teams. If there isn't a lively exchange of progress, you end up with silo code. Requires mutual team respect
    • Track "percent complete". This lives on the fallacy of 100% being a fixed value. Track units of work left to do (and expect that to eventually rise during the project)
    • One way flow. If the people actually writing code can't participate in shaping user stories or create tickets, you have waterfall in disguise
    • Narrow user definitions and stories: I always cringe at the Scrum template for user stories: "As a ... I want ... because/in order to ...". There are two fallacies: first it presumes a linear, single actor flow, secondly it only describes what happens if it works. While it's a good start, adopting more complete use cases (the big brother of user stories) helps to keep the stories consistent. Go learn about Writing Effective Use Cases. The agile twist: A use case doesn't have to be complete to get started. Adjust and complete it as it evolves. Another little trap: The "users" in the user stories need to include: infrastructure managers, db admins, code maintainer, software testers etc. Basically anybody touching the app, not just final (business) users
    • No code reviews: looking at each other's code increases coherence in code style and accellerates bug squashing. Don't fall for the trap: productivity drops by 50% if 2 people stare at one screen - just the opposite happens
  • Big screens
    While co-located we squatted in booked conference rooms with whiteboard, postit walls and projectors. Some of the most efficient working hours were two or three pairs of eyes walking through code, both in source and debug mode. During quiet time (developers need ample of that. The Bose solution isn't enough), 27" or more inches of screen real estate boost productivity. At my home office I run a dual screen setup with more than one machine running (However, I have to admit: some of the code was written perched into a cattle class seat travelling between Singapore and the US)
  • Automate
    We used both Jenkins and Travis as our automation platform. The project used Maven to keep the project together. While Maven is a harsh mistress spending time to provide all automation targets proved invaluable.
    You have to configure your test regime carefully. Unit test should not only run on the CI environment, but on a developers workstation - for the code (s)he touches. A full integration test for VoP on the other hand, runs for a couple of hours. That's the task better left to the CI environment. Our Maven tasks included generating the (internal) website and the JavaDoc.
    Lesson learned: setting up a full CI environment is quite a task. Getting the repeatable datasets in place (especially when you have time sensitive tests like "provide emails from the last hour") can be tricky. Lesson 2: you will need more compute than expected, plan for parallel testing
  • Ownership
    David owned performance, Michael the build process, Raj the Query Parser, Christopher the test automation and myself the query strategy and core classes. It didn't mean: being the (l)only coder, but feeling responsible and taking the lead in the specific module. With the sense of ownership at the code level, we experienced a number of refactoring exercises, to the benefit of the result, that would never have happened if we followed Code Monkey style an analyst's or architect's blueprint.
As usual YMMV

Disclaimer

This site is in no way affiliated, endorsed, sanctioned, supported, nor enlightened by Lotus Software nor IBM Corporation. I may be an employee, but the opinions, theories, facts, etc. presented here are my own and are in now way given in any official capacity. In short, these are my words and this is my site, not IBM's - and don't even begin to think otherwise. (Disclaimer shamelessly plugged from Rocky Oliver)
© 2003 - 2017 Stephan H. Wissel - some rights reserved as listed here: Creative Commons License
Unless otherwise labeled by its originating author, the content found on this site is made available under the terms of an Attribution/NonCommercial/ShareAlike Creative Commons License, with the exception that no rights are granted -- since they are not mine to grant -- in any logo, graphic design, trademarks or trade names of any type. Code samples and code downloads on this site are, unless otherwise labeled, made available under an Apache 2.0 license. Other license models are available on written request and written confirmation.