Part of this article covers issues specific to the C, C++ and Java programming languages, and mentions relevant APIs. Feel free to skip those parts if you’re not familiar with the languages.
This is the sixth article in the SDK Design Goal series. Please see the introduction article “How to present the licensed technology the right way?”.
When you developed your product, it only needed to work with files. After all, the users only needed to open an image file – or a set of files – and recognize the characters in it. Or to scan the files for malware. Or to encode them into a video. Thus your technology operates on files, and when you decided to make it available for licensing, this was the only option you had.
Now you face a dilemma. Should you take the technology as-is and make it available with its limited functionality? Or should you spend time and effort adding more functionality – and potentially more bugs – and delay the time to market? You speak with engineering, and they believe supporting only files is good enough. You ask: what if someone wants to handle data stored in memory? Or in a database blob? They say this is not likely to happen, but even then it shouldn’t be difficult to dump the data to disk for processing. Just four lines of code or so. No big deal, right?
No, unfortunately this is not right, for the following reasons:
- Since you require your licensees to do extra work to use your SDK in their scenario, this extra work will be counted as part of the SDK integration. You might think this is not fair: after all, they only need to write this code because their data is not in the format which “everyone uses”, so converting it should be considered fixing a deficiency in their application, not SDK integration effort. However, since they are completely fine with the data format they use, they will attribute this effort to your SDK. As a result, a competing SDK may turn out to be quicker to integrate, even if it requires SDK-specific work of its own!
- Extra work increases the chance that they will introduce bugs, and since this code is written specifically to use your SDK, your engineers will eventually be involved in debugging it, costing both of you time and money.
- Converting the data from one form to another (such as dumping it from memory to disk) may introduce technical and legal issues. For example, if their application crashes, its memory is freed automatically, but the files on disk are not deleted automatically. Thus yet more effort is needed to ensure the application doesn’t pollute its environment even in case of errors. The data may also be third-party data, and the licensee might not be legally allowed to write it to disk at all.
- Converting the data from one form to another (such as dumping it from memory to disk) introduces a performance penalty. In some cases it may be negligible: if your SDK takes ten minutes to process an image, the time required to dump it to disk is not a major concern. However, if your SDK processes data quickly, any extra step required to prepare this data can become a performance bottleneck.
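To see why the “four lines” estimate is optimistic, here is a minimal Java sketch of the workaround a licensee would write. FileOnlySdk and its process() method are hypothetical stand-ins for an SDK that accepts only file names; they are assumptions made for illustration, not a real API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in for an SDK that accepts only file names.
final class FileOnlySdk {
    static long process(String fileName) throws IOException {
        // A real SDK would recognize text, scan for malware, and so on.
        // This stub only reports how many bytes it was given.
        return Files.size(Paths.get(fileName));
    }
}

// The code the licensee has to write when the data lives in memory.
final class LicenseeWorkaround {
    static long processInMemory(byte[] data) throws IOException {
        // A unique temporary name avoids clashes between concurrent calls.
        Path temp = Files.createTempFile("sdk-input-", ".bin");
        try {
            Files.write(temp, data);
            return FileOnlySdk.process(temp.toString());
        } finally {
            // Runs on normal return and on exceptions, but not when the
            // process crashes or is killed: the file then stays on disk.
            Files.deleteIfExists(temp);
        }
    }
}
```

Even this short version has to pick a temporary location, guarantee cleanup, and pay for an extra write and read of the whole content, and it still leaves a file behind if the process dies at the wrong moment.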
Thus I suggest to SDK designers that even the initial release of your technology should already have all the functionality which most licensees would need. Yes, this will take extra time and effort, and yes, this will mean duplicating some functionality. For example, you want the licensee to be able to provide the data to your SDK in the following ways:
- As a file, using the file name. This is very easy to use in prototyping;
- As a file handle (an operating system-defined handle such as the one returned by open() on Linux/Mac or by CreateFile() on Windows, or a File object in Java). Besides reusing an existing handle – your licensee might have already opened the file – a less obvious benefit is direct access to errors if opening the file fails. This is much better than getting an error such as “CANT_OPEN_FILE” from your SDK and trying to find out why exactly it couldn’t open the file.
- As a file, using the file object of the language’s standard library (such as a stdio FILE in C, an fstream in C++, or a FileInputStream in Java). The benefit is not obvious, but this may still be very useful if your licensee uses those in their code, or gets those objects from another library. They can then integrate quicker, as they can pass those objects directly to your SDK.
- As a stream (such as an istream in C++ or an InputStream in Java). This allows the licensee to pass any stream-like object directly to your SDK, including those which are not files – for example, the stream returned by HttpURLConnection.getInputStream() in Java. Again, the goal here is to simplify the integration, so the licensee needs to write less code.
- As in-process memory in a contiguous buffer. This is a typical scenario in, for example, gateway-like environments, where the content passing through rarely hits the disk. It also offers good performance, and can be used for memory-mapped files as well.
- Through an IStream-like interface, where your SDK asks for a piece of data and the licensee provides it. This allows easy integration when the content is stored in non-contiguous memory buffers (such as HTTP chunks) or in non-trivial storage such as a database blob.
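As a rough illustration of the list above, the following Java sketch shows how such a set of entry points might look. ContentProcessor, ContentSource and consume() are hypothetical names invented for this example, and the “processing” only counts bytes. In Java a FileInputStream is itself an InputStream, so it is covered by the stream overload.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical callback for content that is not a file, a stream or one buffer.
interface ContentSource {
    // Copies up to 'length' bytes starting at 'offset' into 'dest'.
    // Returns the number of bytes copied (at least 1), or -1 at the end.
    int read(long offset, byte[] dest, int length) throws IOException;
}

// Hypothetical SDK facade: one overload per way of supplying the data.
final class ContentProcessor {
    // By file name: the easiest option for prototyping.
    static long process(String fileName) throws IOException {
        return process(new File(fileName));
    }

    // By File object the licensee already has.
    static long process(File file) throws IOException {
        try (InputStream in = new FileInputStream(file)) {
            return process(in);
        }
    }

    // By any stream; the caller owns the stream and closes it.
    static long process(InputStream in) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += consume(chunk, n);
        }
        return total;
    }

    // By contiguous in-memory buffer: no I/O at all.
    static long process(byte[] data) {
        return consume(data, data.length);
    }

    // By callback: the SDK pulls the pieces it needs from the licensee.
    static long process(ContentSource source) throws IOException {
        byte[] chunk = new byte[8192];
        long offset = 0;
        long total = 0;
        int n;
        while ((n = source.read(offset, chunk, chunk.length)) != -1) {
            total += consume(chunk, n);
            offset += n;
        }
        return total;
    }

    // Stand-in for the real work: just counts the bytes it was handed.
    private static long consume(byte[] data, int length) {
        return length;
    }
}
```

With overloads like these, a licensee holding a byte array calls ContentProcessor.process(bytes) and one holding a network stream calls ContentProcessor.process(stream), with no glue code in either case.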
Of course, depending on how your SDK uses the content, some of those scenarios might make no sense. If you read every file passed to your SDK completely and sequentially (no seeking involved), and don’t need to know its size in advance, there is no practical difference between separate interfaces in Java for File, FileInputStream and InputStream – InputStream alone would suffice. However, if your SDK can benefit from being able to seek and/or reread parts of the file, providing different interfaces makes more sense. You need to provide what would be reasonably needed, but only that, not everything possible.
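The difference between a seekable file and a plain stream can be sketched in Java as follows. TrailerReader is a hypothetical helper, and reading the last few bytes of the content is only an assumed example of an SDK that needs to seek.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.util.Arrays;

// Hypothetical helper that needs the last n bytes of the content (n >= 0).
final class TrailerReader {
    // With a File, the SDK can seek straight to the part it needs.
    static byte[] lastBytes(File file, int n) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            int count = (int) Math.min(n, raf.length());
            byte[] tail = new byte[count];
            raf.seek(raf.length() - count);
            raf.readFully(tail);
            return tail;
        }
    }

    // With a plain InputStream there is no seek: the whole content
    // has to be read before the tail is known.
    static byte[] lastBytes(InputStream in, int n) throws IOException {
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            all.write(chunk, 0, read);
        }
        byte[] data = all.toByteArray();
        return Arrays.copyOfRange(data, Math.max(0, data.length - n), data.length);
    }
}
```

Both overloads return the same bytes, but the File version touches only the end of the file, while the stream version reads and buffers everything. This is the kind of tradeoff that is invisible from the function signatures alone.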
Another important point is that the different functions mentioned above might differ in performance; for example, passing a file handle/File object may be much more efficient than passing a stream. Those differences are not likely to be obvious to your licensees. Make sure you describe them in the associated documentation, so the licensee knows the tradeoffs and can choose the right function.
Finally, I would not recommend providing only the most flexible approach (such as the IStream-like interface), which some engineers advocate. It is true that such a function alone could be used to implement any of the functions listed above. However, it is much more difficult to use, requires writing significantly more code, and is more prone to bugs.
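To make the cost concrete, here is a hedged Java sketch of what a licensee would have to write just to pass an in-memory buffer to an SDK that offers only a callback interface. ChunkReader, CallbackOnlySdk and ByteArrayReader are hypothetical names invented for this example.

```java
import java.io.IOException;

// Hypothetical "most flexible" interface: the SDK pulls data via a callback.
interface ChunkReader {
    // Copies up to dest.length bytes starting at 'offset' into 'dest'.
    // Returns the number of bytes copied (at least 1), or -1 at the end.
    int read(long offset, byte[] dest) throws IOException;
}

// Hypothetical SDK that exposes nothing but the callback entry point.
final class CallbackOnlySdk {
    static long process(ChunkReader reader) throws IOException {
        byte[] chunk = new byte[4096];
        long offset = 0;
        int n;
        while ((n = reader.read(offset, chunk)) != -1) {
            offset += n; // stand-in for real work: count the bytes seen
        }
        return offset;
    }
}

// Adapter every licensee must write, and get right, for a plain byte array.
final class ByteArrayReader implements ChunkReader {
    private final byte[] data;

    ByteArrayReader(byte[] data) {
        this.data = data;
    }

    @Override
    public int read(long offset, byte[] dest) {
        if (offset < 0) {
            throw new IllegalArgumentException("negative offset");
        }
        if (offset >= data.length) {
            return -1; // end of content
        }
        // Easy places for off-by-one mistakes: bounds, casts, lengths.
        int n = (int) Math.min(dest.length, data.length - offset);
        System.arraycopy(data, (int) offset, dest, 0, n);
        return n;
    }
}
```

If the SDK also accepted a byte array directly, the whole adapter class would collapse into a single call on the licensee's side, and its bounds handling would be written and tested once, by the SDK vendor.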