• CSV header converters in Ruby

    The CSV library in the Ruby stdlib is great and easy to use, and I’ve often used it for data migrations and imports. When importing data I often find it useful to validate the headers of the imported CSV, to ensure that valid columns are provided. Some users may provide columns in a different case to what you expect, or with different punctuation (including spaces etc.). To normalize the headers when parsing a CSV, you can pass an option called header_converters to new (other methods such as parse, read, and foreach accept the same options). Here is a simple example of how you can convert the headers of the parsed CSV to lowercase:

    # Source CSV looks like:
    # First name,last Name,Email
    # Abraham,Lincoln,alincoln@gmail.com
    # George,Washington,gwashington@outlook.com
    require 'csv'

    downcase_converter = lambda { |header| header.downcase }
    # CSV.read takes a file path; CSV.parse expects the CSV content itself as a string
    parsed_csv = CSV.read('/path/to/file.csv', headers: true, header_converters: downcase_converter)
    parsed_csv.each do |row|
      puts row['first name']
      # => Abraham
      # => George
    end

    Simple as that. You can do anything to the headers here. There are also a couple of built-in header converters (:downcase and :symbol) that can be used, and you can pass an array of converters rather than just one. Converters can be used for the cells in the CSV rows too, not just the headers. The documentation for the Ruby CSV class is quite clear and helpful, so take a look to see the myriad other options for reading and writing CSVs in Ruby.
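
    As a rough sketch of the array form and of cell converters, the example below chains a custom lambda with the built-in :downcase header converter and uses the built-in :integer converter for cell values. The sample data, the Age column, and the strip_converter name are made up for illustration:

```ruby
require 'csv'

# Hypothetical sample data; note the stray spaces around two of the headers.
data = <<~DATA
  First name , last Name,Age
  Abraham,Lincoln,56
DATA

# Custom header converter: trims surrounding whitespace.
strip_converter = ->(header) { header.strip }

table = CSV.parse(
  data,
  headers: true,
  # Header converters run in the order given: strip first, then downcase.
  header_converters: [strip_converter, :downcase],
  # Cell converter: values that look like integers become Integer objects.
  converters: :integer
)

row = table.first
puts table.headers.inspect # => ["first name", "last name", "age"]
puts row['first name']     # => Abraham
puts row['age'] + 1        # => 57
```

    Because the converters run in order, the custom one can clean up whatever the built-in ones do not handle.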

    Originally, I found this solution and tweaked it a bit from this StackOverflow answer - https://stackoverflow.com/questions/48894679/converting-csv-headers-to-be-case-insensitive-in-ruby

  • Per-page background images using Prawn and Ruby

    Prawn is an excellent PDF generation library for Ruby, and we use it for all our PDF needs at work. Their manual is some of the best documentation I have read. Recently, I needed to set a different background image on every page of a PDF I was generating. The Prawn documentation, while good, only shows how to use a background image for the whole PDF:

    img = "some/image/path.jpg"
    Prawn::Document.generate(filename, background: img, margin: 100) do |pdf|
      pdf.text 'My report caption', size: 18, align: :right
    end

    So, I decided to dig into their source code to see how they rendered the background image. After a short search I found what I needed. Turns out, this works for rendering multiple different background images! In prawn you can call pdf.start_new_page to start a new page, and on each new page I would call the following to set the new background for that page:

    background_image_path = 'some/path/for/this/page.jpg'
    pdf.canvas do
      pdf.image(background_image_path, scale: 1, at: pdf.bounds.top_left)
    end

    I was able to generate the PDF with different background images perfectly with this code.

  • Prevent remote: true links opening in new tabs or windows in Rails

    In Rails, you can use the option remote: true on forms and links for Rails to automatically send AJAX requests when the form is submitted or the link is clicked. I plan to write a more in-depth article about this extremely useful feature in time, but essentially you just need to add an X.js.erb file in your views directory for your controller, where X is the action, and Rails will deliver this JS file as a response to the AJAX request and execute it. Now, most of the time you will not want these AJAX/JS-only routes to render an HTML view, but by default users can middle click the remote: true link or open it in a new tab, which will show an ActionView::MissingTemplate error because there is no X.html.erb file present.


  • ImageMagick unable to load module error on AWS Lambda

    Last Friday we started seeing an elevated error rate in our AWS Lambda function that converted single page PDFs into images using ImageMagick. We had been seeing the same error crop up randomly for around two weeks before Friday, but we were busy with other things and didn’t look too deeply into it. This was a mistake in retrospect. Below is the error in question:

    Error: Command failed: identify: unable to load module `/usr/lib64/ImageMagick-6.7.8/modules-Q16/coders/pdf.la': file not found @ error/module.c/OpenModule/1278.
    identify: no decode delegate for this image format `/tmp/BEQj9G8xj1.pdf' @ error/constitute.c/ReadImage/544.

    To figure out the dimensions of the PDF, to convert it to an image, and to optimize the size we were using the gm nodejs package. This is just a friendly wrapper around calling ImageMagick directly. ImageMagick version 6.8 is installed on AWS Lambda base images by default. It took a while and a lot of googling and experimentation to figure out where the error was coming from. I found a StackOverflow question which was pivotal. It held vital information and pointed to a blog post on the AWS blog which talked about upcoming changes to the Lambda execution environment and a migration window. There was only one problem.

    We were at the very end of the migration window.

    Turns out Amazon likely removed a module referenced by pdf.la, which makes it so converting PDFs to images using ImageMagick no longer works on AWS Lambda. Now, the fix to this was essentially to use GhostScript instead to convert the PDFs to images, and then still use ImageMagick to resize the images. The steps I followed were (applicable to nodejs):

    1. Include the bin and share directories from https://github.com/sina-masnadi/lambda-ghostscript into our Lambda function, so we had a compiled version of GhostScript that worked on AWS Lambda.
    2. Change the JS code to call the GhostScript command to convert the PDF (sample below, command here)
    3. Upload the new code to lambda and make sure everything still worked (it did!)

    The answer on the StackOverflow question above is similar to the process I followed but I didn’t bother with lambda layers. Here is what our JS function to convert the PDF to image looks like:

    // execSync comes from Node's built-in child_process module
    const { execSync } = require('child_process');

    // tempFile is the path to the PDF to convert. make sure
    // your path to the ghostscript binary is set correctly!
    function gsPdfToImage(tempFile, metadata, next) {
      const outputFile = tempFile.replace('.pdf', '.jpeg');
      console.log('Converting to image using GS');
      console.log(execSync('./bin/gs -sDEVICE=jpeg -dTextAlphaBits=4 -r128 -o ' + outputFile + ' ' + tempFile).toString());
      next(null, outputFile, metadata);
    }

    After I put the fix in place all the errors went away! Lesson learned for next time…pay more attention to the AWS blog! Here is our Lambda function success/error rate chart for last Friday (errors in red). It’s easy to see where the fix went live:

    [Image: imagemagick lambda errors]

  • Rails Forms with Virtus and ActiveModel

    I absolutely HATED doing forms in Rails, until we came across this method of doing them at work. Our goal was to make forms simple to set up and to have clear logic and separation of concerns. We were using Reform at first, and although it worked well for simple one-to-one form-to-model relationships, it quickly fell apart when more complex model relationships were involved. It also struggled when there were complex validations or different logic paths when saving the forms. And there was no way to control the internal data structure of the form. Enter Virtus and ActiveModel.


  • Subset Sum Problem in Ruby

    I came across a bizarre data storage decision in a recent data migration. For context, in Australia there is a kind of government demographic survey that must be reported to by certain organisations. One of the data points is “Qualifications Achieved” or something to that effect, which accepts a comma-separated list of values. For example, the qualifications and their values are similar to:

    524 - Certificate I
    521 - Certificate II
    514 - Certificate III
    410 - Advanced Diploma
    008 - Bachelor Degree

    If a person had achieved a Certificate III and a Bachelor, you would report 514,008 for that person to the government, for that data point. In the database in question there was a column which stored a single value. In this case it was 522, which is 514 + 008. So, if I wanted to break apart this number into its component parts to store it a bit more sensibly, I needed to figure out which of the source numbers added up to the target number.

    I’m sure any developer reading this has had a problem where they are sure there is an answer, but they just don’t know what to search for. After some Googling it turns out this is called the subset sum problem. And someone had thoughtfully made an implementation in ruby which I could use:


    Note that in my case I needed only one output set, which worked because every combination of numbers in my source set adds up to a unique total. E.g. for the numbers above no combination except 514 + 008 adds up to 522. If you need it to, this algorithm can also return multiple number sets that add up to the total.

    So, I took the algorithm, took my source numbers for each different data point, and my totals from the database, and it spat out the correct combinations! 1053 = 008 + 521 + 524. Aren’t algorithms magic sometimes?
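
    The implementation I used is not reproduced here. As a stand-in, below is a minimal brute-force sketch of the same idea, assuming the list of codes is small. The subset_sums method name is made up for illustration, and this is not the linked implementation:

```ruby
# Returns every subset of `numbers` whose elements add up to `target`.
# Brute force: checks all combinations of every size, which is fine for a
# handful of codes but grows exponentially with the length of the list.
def subset_sums(numbers, target)
  (1..numbers.size).flat_map do |size|
    numbers.combination(size).select { |combo| combo.sum == target }
  end
end

# The qualification codes from above. 008 is written as 8 because a
# leading zero makes Ruby read an integer literal as octal.
codes = [524, 521, 514, 410, 8]

p subset_sums(codes, 522)  # => [[514, 8]]
p subset_sums(codes, 1053) # => [[524, 521, 8]]
```

    If only one answer is expected, taking the first element of the result is enough. A result with more than one set would mean the stored total is ambiguous.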

  • Find duplicate rows in SQL

    Sometimes you need to find and count duplicate data rows in SQL. For example, in my use case I needed to find records in a table where there was more than one usage of the same email address. This would help me figure out how widespread and severe the duplicate issue was; the table in question should not have had duplicate rows based on that column in the first place! (A missing UNIQUE index was the culprit).

    SELECT email, COUNT(*)
    FROM user_accounts
    GROUP BY email
    HAVING COUNT(*) > 1;

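    The HAVING clause is the important part of this query. To find duplicates, we need to check if any of the groups have a record count > 1. You can add other conditions on the groups to the HAVING clause if required, as long as they use aggregates or grouped columns, e.g. COUNT(*) > 1 AND COUNT(*) < 10. A filter on individual rows, such as account_status = 1, belongs in a WHERE clause before the GROUP BY instead; standard SQL rejects a non-grouped column in HAVING, although some databases are more lenient.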

    The result of this query can then be used in a subquery or WHERE clause. The result looks like:

    email              | count
    j.wayne@gmail.com  | 2
    g.cooper@gmail.com | 3
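
    For comparison, the same group-then-filter logic can be sketched in plain Ruby over an in-memory array. The rows below are made-up sample data shaped to match the result above, and the unique.user@gmail.com address is hypothetical:

```ruby
# Hypothetical rows standing in for the user_accounts table.
rows = [
  { email: 'j.wayne@gmail.com' },
  { email: 'g.cooper@gmail.com' },
  { email: 'j.wayne@gmail.com' },
  { email: 'unique.user@gmail.com' },
  { email: 'g.cooper@gmail.com' },
  { email: 'g.cooper@gmail.com' }
]

# GROUP BY email, with COUNT(*) per group.
counts = rows.group_by { |row| row[:email] }.transform_values(&:size)

# HAVING COUNT(*) > 1
duplicates = counts.select { |_email, count| count > 1 }

p duplicates
# => {"j.wayne@gmail.com"=>2, "g.cooper@gmail.com"=>3}
```

    This is only practical for small data sets; for a real table the SQL query is the better tool.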

  • Global rescue_from error in Rails application_controller

    In our Rails application, we needed a way to raise security access violations based on the user profile and bubble them all the way up to the application controller. We looked into it and found you can use rescue_from in your application controller, which allows you to specify an error class and a method inside the controller to call when that error is encountered. For example:

    class ApplicationController < ActionController::Base
      rescue_from Errors::SomeCustomErrorClass, with: :handle_error_method
      def handle_error_method(error)
        # do some error handling
      end
    end

    It’s probably not a good idea to handle the normal Ruby StandardError in this way, as that may get you into trouble, but it is perfect for custom errors raised deliberately from within your application! I really like this pattern of nesting an error class definition inside the class that raises that error. For example, in the result of a security check:

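    class SecurityCheckResult
      class AuthorizationError < StandardError; end

      def run
        # raise takes the error class and a message; AuthorizationError(message) would be a method call
        raise AuthorizationError, message if check_invalid?
      end
    end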

    Then in the application controller I could just rescue_from SecurityCheckResult::AuthorizationError to catch this anywhere in my app, and do something like a redirect or a flash. If you need to use this pattern in regular Ruby code you can include the ActiveSupport::Rescuable module. This article has a great example of using the module in regular Ruby code (scroll down to the part that mentions RoboDomain).
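
    The nested error class itself needs nothing from Rails. Here is a minimal plain-Ruby sketch of raising and rescuing it, using an ordinary rescue clause rather than rescue_from or ActiveSupport::Rescuable. The allowed flag and the handle_request method are hypothetical stand-ins for a real security check and a real caller:

```ruby
class SecurityCheckResult
  # Nesting the error keeps it namespaced under the class that raises it.
  class AuthorizationError < StandardError; end

  # `allowed` is a hypothetical stand-in for the outcome of a real check.
  def initialize(allowed)
    @allowed = allowed
  end

  def run
    raise AuthorizationError, 'access denied' unless @allowed

    :ok
  end
end

# Hypothetical caller that plays the role of the controller.
def handle_request(allowed)
  SecurityCheckResult.new(allowed).run
rescue SecurityCheckResult::AuthorizationError => e
  "rescued: #{e.message}"
end

p handle_request(true)  # => :ok
p handle_request(false) # => "rescued: access denied"
```

    Callers only ever need to know the fully qualified name, SecurityCheckResult::AuthorizationError, which makes it obvious where the error comes from.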

  • Getting nodejs file permissions from fs.stat mode

    When you need to get file stats using NodeJS (which uses the unix stat system call under the hood), you can use the fs.stat call as shown below:

    fs.stat('path/to/file', function (err, stats) { });

    The stats object returned here is an instance of fs.Stats which contains a mode property. You can use this property to determine the unix file permissions for the file path provided. The only problem is that this mode property just gives you a number (as referenced in this GitHub issue). To view the permissions in the standard unix octal format (e.g. 0445, 0777 etc) you can use the following code:

    var unixFilePermissions = '0' + (stats.mode & parseInt('777', 8)).toString(8);

    Some examples of the mode before and after calling the above snippet:

    33188 -> 0644
    33261 -> 0755
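
    The mask arithmetic is not specific to NodeJS. As a quick cross-check, here is the same calculation sketched in Ruby, where File.stat(path).mode returns the same kind of number. The octal_permissions method name is made up for illustration:

```ruby
# Keeps only the lowest nine permission bits (octal 777) of a stat mode
# and formats them as a zero-padded, four-digit octal string.
def octal_permissions(mode)
  format('%04o', mode & 0o777)
end

puts octal_permissions(33188) # => 0644
puts octal_permissions(33261) # => 0755
```

    The higher bits that get masked off describe the file type (33188 is 100644 in octal, where the leading 100 marks a regular file), which is why the raw mode looks so unfamiliar.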

  • field_with_errors changes page appearance in Rails

    I had a minor issue with my Rails view when I had a list of radio buttons wrapped in labels. When there are form errors on a field like a radio button, Rails wraps that field in a div with the CSS class .field_with_errors. Because a div is a block element by default, this causes some issues with alignment as seen in the screenshot below:

    [Image: field with errors]

    All you need to do to fix this is make the .field_with_errors class display inline like so:

    .field_with_errors { display: inline; }
