Earlier this week I watched Alex Chan's talk on Sans I/O programming. In the talk, Alex argues for separating I/O from program logic, with reference to a situation where their team was unable to use already-available parsing libraries for BagIt data because those libraries depended on local access to a file. The talk resonated with me because it made me realise how often I couple I/O and logic in my code without noticing. I'd highly recommend watching the talk in full, or at least reading Alex's accompanying summary on their blog.
IndieWeb Utils, a community library that implements various IndieWeb standards and helper functions, asks users to pass a URL to do various tasks. One example is discovering the representative h-card for a page. This function asks for a URL and then makes a network request to get the HTML for that page. This means anyone who uses this function must make this network request. This is unfortunate when you consider this scenario: what if you have already retrieved the page in question?
I can easily imagine a scenario where someone has already made a network request to retrieve the resource. Maybe someone is writing a function that gets the title of a page and its representative h-card. If they have already fetched the page to read its title, the IndieWeb Utils library will fetch the same page again to find the h-card. This isn't ideal. Why make two network requests when you can make one?
The solution for this is for IndieWeb Utils to separate I/O and the program logic. The representative h-card discovery function could take in a string of HTML instead of just a URL. Other functions, such as one used to generate context for a reply, could accept a Beautiful Soup 4 object, since IndieWeb Utils depends heavily on Beautiful Soup for HTML parsing.
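To make that idea concrete, here is a minimal sketch of a logic-only function that works on HTML the caller already has. It is not IndieWeb Utils' actual API and it does not implement representative h-card discovery: `find_h_card_tags` is a hypothetical name, and it only reports which elements carry the `h-card` class. To keep the example dependency-free, it uses Python's built-in `html.parser` rather than Beautiful Soup.

```python
from html.parser import HTMLParser


class _HCardFinder(HTMLParser):
    """Collects the tag name of every element whose class list includes "h-card"."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        classes = (dict(attrs).get("class") or "").split()
        if "h-card" in classes:
            self.found.append(tag)


def find_h_card_tags(html: str) -> list[str]:
    """Pure logic: takes an HTML string and never touches the network."""
    finder = _HCardFinder()
    finder.feed(html)
    finder.close()
    return finder.found


# The caller decides where the HTML comes from: a request, a cache, a test fixture.
page = '<html><body><a class="h-card u-url" href="/">Jo</a><p>Hi</p></body></html>'
print(find_h_card_tags(page))  # ['a']
```

Because the function only sees a string, it can be tested without any network access at all.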
These changes could be implemented in such a way that core functionality remains unchanged. For instance, if a raw URL is passed, the library could proceed as it does right now; if an HTML string is passed, that could be used in place of the body of a network response. And so on for different types of input and output.
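One way this backwards-compatible behaviour might look is an optional `html` argument alongside the existing URL. The sketch below is an assumption about a possible design, not how IndieWeb Utils works today: `fetch_html`, `page_title` and `get_page_title` are hypothetical names, and the title lookup is a rough regular expression standing in for real parsing logic.

```python
import re
import urllib.request
from typing import Callable, Optional


def fetch_html(url: str) -> str:
    """The only function in this sketch that touches the network."""
    with urllib.request.urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def page_title(html: str) -> Optional[str]:
    """Stand-in logic: a rough <title> lookup, not a full HTML parse."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def get_page_title(
    url: str,
    html: Optional[str] = None,
    fetch: Callable[[str], str] = fetch_html,
) -> Optional[str]:
    """Existing callers pass only a URL; callers who already hold the HTML pass it too."""
    if html is None:
        html = fetch(url)  # same behaviour as before: one request per call
    return page_title(html)


# A caller that has already downloaded the page avoids a second request.
already_fetched = "<html><head><title> My site </title></head></html>"
print(get_page_title("https://example.com/", html=already_fetched))  # My site
```

Making `fetch` a parameter is a design choice for this sketch: it lets a caller swap in a cached or fake fetcher without changing the logic.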
I am new to this concept and am still trying to piece together how this is going to work. Writing this blog post is the beginning of my thinking more about I/O and logic. In my head, I see I/O being handled by a parent function, with a separate function handling the logic. The I/O handling function can accept general formats and convert them into the underlying format the logic needs to do a certain task.
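A rough sketch of that parent-and-child split might look like the following. Everything here is hypothetical: `to_html_text`, `mentions_h_card` and `check_source` are names made up for illustration, and the logic function is a deliberately trivial placeholder. The point is only that the parent accepts several general formats (a string, bytes, or anything with a `read()` method) and the logic only ever sees a plain string.

```python
import io
from typing import IO, Union

HtmlSource = Union[str, bytes, IO[str], IO[bytes]]


def to_html_text(source: HtmlSource, encoding: str = "utf-8") -> str:
    """I/O layer: turn whatever the caller holds into a plain str of HTML."""
    if hasattr(source, "read"):  # open file, StringIO, BytesIO, response object...
        source = source.read()
    if isinstance(source, bytes):
        return source.decode(encoding, errors="replace")
    if isinstance(source, str):
        return source
    raise TypeError(f"unsupported HTML source: {type(source).__name__}")


def mentions_h_card(html: str) -> bool:
    """Logic layer (placeholder): only ever receives a str."""
    return "h-card" in html


def check_source(source: HtmlSource) -> bool:
    """Parent function: normalise the input, then hand it to the logic."""
    return mentions_h_card(to_html_text(source))


print(check_source('<a class="h-card">Jo</a>'))                 # True
print(check_source(b"<p>plain</p>"))                            # False
print(check_source(io.StringIO('<div class="h-card"></div>')))  # True
```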
I'm not sure how long it will take to separate I/O and logic in IndieWeb Utils, but I think the change is worth it, and it will be useful to me personally. I suspect quite a few programs I have written that depend on IndieWeb Utils will get a bit faster once I can pass in content I have already retrieved, rather than only a URL whose contents IndieWeb Utils will fetch. I will write again on this blog once I have worked on some of the changes necessary to separate I/O and logic.