Reverse Multidimensional Bin Packing and Accounting Analytics

This is an interesting problem area, and yes, it does have practical applications .. I can’t talk about the project I am using it on now, but a while back I used it on a problem involving data reconstruction in analysis of accounting records.

In that case the problem presented as follows… the person banking payments from clients was in the habit of “batching” them if payments on several invoices were received at once : he would simply bank and record 1 amount, and not record which invoices the payments were on behalf of. Fortunately he did not do this across clients – only one client’s payments would be aggregated in this manner.

His reasoning was that it did not matter if payments were recorded in this lump sum manner, as long as the outstanding balance for that client was recorded.

But of course for analysis of the accounting records (we were attempting to model cash flow, and the effect of payment policies on cash flow) it was a disaster and completely destroyed our 1 to 1 correspondence between invoices and payments.

To solve this, we had to develop something along the lines of a reverse bin packing algorithm.

I will explain the bin packing bit in a minute, but the idea is simple enough : we know the total $ for a given deposit, and we know what the amount was for each invoice prior to that date .. we wish to deduce which invoices were included in that $ amount.

Simple? No, it is not.

There are many ways that a sum can be made up from its parts ($1000= $900+$100, or $400+$200+$250+$150 etc) and the problem is further complicated by the fact that there may be several such aggregated transactions.

Excerpt from The Algorithm Design Manual: Bin packing arises in a variety of packaging and manufacturing problems.

Suppose that you are manufacturing widgets with parts cut from sheet metal, or pants with parts cut from cloth. To minimize cost and waste, we seek to lay out the parts so as to use as few fixed-size metal sheets or bolts of cloth as possible. Identifying which part goes on which sheet in which location is a bin-packing variant called the cutting stock problem.

After our widgets have been successfully manufactured, we will be faced with another bin packing problem, namely how best to fit the boxes into trucks to minimize the number of trucks needed to ship everything

With a bit of thought we can see how we can reverse the process.

Given a bin, and its total weight ($), we want to impute which items (and we know their weights, or $ amounts) are in that bin. We do this by treating the bins as having a capacity of their total weights, and then attempting to pack the original items into those bins, looking for a perfect fit.

Heuristics when we have non unique items

Of course the key to this is that the individual item weights must be more or less unique.. if we have items with the same weights, then they are interchangeable and the problem has no unique solution .. however additional heuristics can be brought to bear to suggest which solution is better (including human inspection of the solutions, perhaps requiring looking at the original paper records or some other such costly exercise – but at least it is confined to these few cases).

The problem gets more interesting when it gets multidimensional .. when we have multiple measurements (eg weight, volume, value) per item and per bin. It is still not easy to solve but as long as the measurements are exact (integers) and multidimensionally unique, it is feasible to find a perfect solution. I ended up using an exchange algorithm of sorts rather than transforming the problem into 1D bin packing … not quick, but effective – and hey, it’s only computer time and it’s a one off.

If you are interested in multidimensional bin packing have a look at http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/proceedings/&toc=comp/proceedings/icpp/1999/0350/00/0350toc.xml&DOI=10.1109/ICPP.1999.797428
“Multi-Capacity Bin Packing Algorithms with Applications to Job Scheduling under Multiple Constraints”

So is the problem a .. what is the “isa” relationship?

It is interesting that it is not always easy to see what class of problem the thing you are dealing with “is” , what its “isa” relationship is (isa, pronounced “is a” is a term used in object oriented design, particularly in hierarchies .. eg a dog isa quadruped). In this case I stared at the problem for a while, scratched out some code and looked for some solution strategies, talked to a few people .. took me a few days before I realized that it isa (kinda) bin packing problem. Obvious in hindsight.

Applications of bin-packing? .. of course the number is huge but not always obvious. Because the examples are usually couched in engineering or operations-research terms (think machine shops, assembly lines, transportation etc) it makes it rather hard to think how the techniques might be applied to marketing or other data analytic tasks.

But, just occasionally, a nice application comes along and we say AHA! … and start broadening our horizons. (I wonder if there aren’t some applications to finance, portfolio optimisation maybe ..)

Leave a Comment