About

This is where I put all the things I've created which I hope may somehow be useful to others.


Statement

I believe in modular, purpose-built software. The more freely a program or library can be adapted and joined to other software, the better. This of course raises the question of interoperability, documentation, and cooperation amongst developers, and is why I am a staunch proponent of free software. I cannot stress enough how important it is, not just for myself (in that regard I find it to be an unparalleled educational tool), but for humanity as a whole. It is the actualization of creating software without the constraints of intellectual ownership and micromanagement.

I have a strong interest in distributed software systems, especially favoring self-healing networks, distributed data structures, and associated techniques. Beyond that I prefer to create purpose-built, practical applications in a variety of languages. Practical in that their sole use isn't to pad a resume, but rather to be of some use to others now and in the future, even beyond direct use, whether for design ideas or coding practices. Being able to rattle off a list of intangibles for the sole purpose of creating a list reeks of convenience, if nothing else. Above all else, I would hope that what I have created can be of some use to people, even if only pedagogically.

In retrospect, I have come to understand that I am a very design-driven developer, perhaps even over-designing at times. The vast majority of my effort is spent up front, working out multiple approaches and ideas. Generally this pays off, since perfect information about the task at hand rarely exists (so much for SCRUM), and I'd rather fall back on a secondary design than have to re-design on the fly. Over-designing also helps the implementation, in that you know what to expect out of certain parts of the program. Despite modern languages offering an over-abundance of higher-order constructs, I honestly don't believe in their liberal use, e.g. PHP's magic methods. Most of the highly abstracted capabilities of modern languages are more reliably implemented by means of lower-level controls. Illustratively, one could implement an alternative, more highly-tailored form of reflection in any language that doesn't support it by means of status accessor methods. Judgment calls will inescapably need to be made about any design decision, and in some cases these constructs can be indispensable; however, a rush to be highly reflective/dynamic isn't always the best approach, as readability can suffer.



apache2 (2.4.57)
Mar 08, 2024

How I found out about the bug fix for Lua.

Keywords: Apache2, Upload Limit, Request Limit, 1 GB, Lua, idealism.

I have been using Apache since forever, so I tend to take it the wrong way when it gets mucked with.

So, on a recent upgrade bender, while moving some decade-old systems from the 3.x kernel to something recent, and leaving ubunster to come back to Debian all over again, I realized something odd.
There was definitely something blocking me from POSTing files over 1024MB, or even multiple smaller ones adding up to over that limit! This was against controlled code, code whose kinks I had worked out time and time again, and I knew exactly what needed to be adjusted in php.ini. No, this was elsewhere; it wasn't even getting to PHP, and in some browsers it manifested as a "413 request entity too large".
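
For context, these are the php.ini knobs that normally govern large uploads, all of which were already accounted for here (the values below are placeholders, not my actual settings):

; the usual php.ini upload limits (illustrative values only)
upload_max_filesize = 4G
post_max_size = 4G
memory_limit = 512M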

The usual gamut of sites and know-it-alls proved pretty much useless. Stuck in the past they were, perpetually hashing out the memory_limit and post_max_size existentialisms. I quickly started the process of elimination, taking into account working systems with almost-as-new setups. It must, I deduced, be something 'changed' in the last year or two, and definitely Apache-based. A flashback quickly befell me. Ahh, right! Just like when PHP 8 decided (itself) to implicitly set the default timezone to UTC (and not use the system time)! It was a subtle but not TOO difficult change to notice. This one must be similarly situated. A long-standing apache2.conf setting that had never needed to be touched has had its default altered to 1 GiB, implicitly. But why?

🎡🌴🗿 CVE-2022-29404 🗿🌴🎡

So IF you're using Lua on the backend via Apache mod_lua and a malicious script comes along... I'll let the official explanation stand:
In Apache HTTP Server 2.4.53 and earlier, a malicious request to a lua script that calls r:parsebody(0) may cause a denial of service due to no default limit on possible input size.
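
For illustration, a minimal mod_lua handler in the shape the CVE describes might look like the following; the file name and output are hypothetical, but r:parsebody(0) is the exact call named above, and per the CVE text a zero limit means no cap on the body size:

-- upload.lua: hypothetical mod_lua handler, for illustration only
function handle(r)
    -- r:parsebody(0) buffers the POST body with no size limit,
    -- so a large enough request can exhaust memory (the DoS above)
    local POST, FILES = r:parsebody(0)
    r:puts("received")
    return apache2.OK
end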

So to recap: the issue could of course have been mitigated by not using mod_lua (as stated elsewhere), or by manually checking potential user-generated scripts for r:parsebody(0), or perhaps by rewriting the implementation of r:parsebody() to not accept unbounded inputs. Instead, ALL installations of the webserver, WHETHER OR NOT THEY ARE USING LUA, have now had a decades-old convention changed (WITH NARY A NOTICE IN SUPPLIED apache2.conf FILES!) to a value which induces behavior that is VERY time-consuming to track down, given the multitude of competing settings in PHP and other backend solutions which mimic the same symptom, all while affording the admin ZERO error log notices in a vanilla log setup that the situation is even occurring. Oh, and the fix could also have just been an alternative "LimitRequestBody" directive in mod_lua's conf which it could have been required to use. The possibilities are endless. What shouldn't have happened, though, is what did happen.

FIX: So, the way to fix this issue, if you haven't run across it already, is to add

LimitRequestBody 0

to apache2.conf. This will "magically" bring back the unlimited request size handling that Apache has ALWAYS had since the earth cooled and Lua was born (sometime afterwards). The new behavior is that if you omit LimitRequestBody from your configs, the old default of 0 (meaning unlimited) no longer applies; it now defaults to 1 GiB, to stop those bad guys in their tracks!

The REAL issue is how this fix is somehow considered an ideal resolution to the initial CVE at hand. Fixes should always address the problem as close to the original cause as possible. Something in a mod_lua config would have been much more appropriate for those of us luddites who don't use the module.
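
As a rough sketch of the kind of scoping I mean (both the pattern and the value here are purely illustrative, not anything the project recommends), the cap could have been confined to requests that actually hit Lua scripts:

# hypothetical: cap request bodies only where lua scripts are served,
# leaving everything else at the historical unlimited default
<FilesMatch "\.lua$">
    LimitRequestBody 1073741824
</FilesMatch>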

Metamorph
Feb 17, 2024

The industry is changing. Right now, before our very eyes, interesting things are taking hold. No one can truly say where things may end up, and not everything will be for the better. At least for the moment, though, before the storm, it gives us pause to look around and remember the amazing things which have come to pass: things which have died, are dying, or have helped us get to where we are now.

In loving memory of the free ESXi vSphere Hypervisor

...and for that matter VMWare...

https://news.ycombinator.com/item?id=39359534

Deepmind You
Nov 03, 2023

I really hate the term AI, especially when used in its contemporary setting. At the very best, what we see are computational models, statistical combinatoric synthesis, and particular forms of a stochastic parrot at work, be it in visual art or text. Eventually this will permeate the entire range of known human outputs.

Whether it is at all good or not is entirely up to your frame of mind. Two distinct personality-type pools come to mind, and I posit that the predominantly sensing types will prefer the new x-model creation engines, while the intuitive types will shun them in favor of more "genuine" forms of human invention/inspiration.

What lies ahead however is the headlong intersection of this new tech with society and particularly its conflict with existing laws, most specifically copyright.

One such fight is being hashed out via comments on copyright.gov. It is there I of course submitted my two cents.
https://copyright.gov/policy/artificial-intelligence/


I think the question of copyright and AI has hit a crossroads. Obviously there have been problems with copyright for decades, but with the trivialization of the 'creative' process that has been brought to light recently, via the use of generative processes to recombine, ad infinitum, various elements much like random notes in a song, we have been left with an introspection into what it takes to create, and it is much less than satisfying. We want to believe in inspiration, and I truly believe there is such a concept. There is, however, at least some fraction of every work which is taken, in at least some transformative way, from past works and experiences. This is inescapably so, and will always be part of the human experience. It is not, however, in itself an argument for a conditional form of copyright, one which would otherwise rest on needing to provide one's sources. Since there will always be some form of adaptive plagiarism, we must always look to whether the new work is sufficiently distinct and not merely a ripoff of some prior work.

The main aspect of what copyright must accomplish is then based on what it at least aims to achieve: an incentive to innovate. It's this incentive which, even though at times it won't be fully achieved and may not even hold up honestly, must still persist as the goal, which in turn is rewarded with the protections of copyright, at least for a time.

Looking at how AI, or at the very least generative processes, work and recombine elements, and then listening to the derivative technologies, one can only be left with bewilderment. For a huge range of the processes underlying it, few can say with absolute certainty what the aim of the machine was at a macroscopic level. The technology is extremely hard to gauge at the atomic level too, especially at the scales seen presently. So what are we left with, then, trying to apply a reward to such a system? How can it value such incentives? At the very least, one can expect someone, or some group pushing the buttons, to reap the rewards of said works. This furthers the strong at the expense of the weak and their government, cranking out marginally unique ideas, a perversion of the system. In the end, sadly, not much can be done to prevent this angle, as groups can simply claim their ideas were self-inspired. It should not, however, be considered fruitless to enforce, in spite of this, that no generative system can "claim" copyright over randomly seeded works. Rights should only be awarded to individuals, or groups made of individuals, on the grounds that this can inspire and entice further works to be created.

The second angle of AI with respect to pre-existing copyrighted works regards their use in training. I see no reason such a system cannot and shouldn't be allowed to operate, firstly because, in and of itself, it doesn't pose any immediate attack on existing works and the copyright system, granted my first angle holds. That again being that generative or non-human systems cannot be used to yield newly copyrighted works. In essence, this aspect is taken care of so long as the first holds. Unfortunately, the future is extremely messy and will be riddled with groups trying to subvert both issues, in bad faith.

IOT Rot
Oct 07, 2023

Back on 25 Sept I submitted comments for the FCC's Cybersecurity Labeling for Internet of Things.

https://www.fcc.gov/ecfs/search/docket-detail/23-239


IoT security has become an issue of concern primarily because of scale.
A single compromised device isn't even noisy, but millions definitely are a problem.
From this, it's in all of our interests to hold companies to task with regard to whether or not their boxes are potentially contributing to destructive botnets at large.
As a professional and hobbyist developer, I always lean on open source as not only an immeasurable source of knowledge but also of utility for the end user.
I don't see how a forced open-source approach is necessarily the best solution for proprietary market forces from the get-go. However, when balancing an army of abandoned, unpatched, proprietary products against one which has been open-sourced at abandonment, it very much seems that if a device has been deemed EOL, it should certainly be open-sourced so that independent developers have a shot at salvage, lest something catastrophic be found.
The only remaining question, then, is how long each period should last before a manufacturer which forgoes any further updates is required to relinquish source. This gets all too complicated when a product is stagnant for many years and faces no known vulnerability which needs patching.
At present, however, I'm at a loss for concrete numbers, as product categories span huge functional and practical realms. Some things need to be regularly replaced, while others less so. Minimally, the update period should be yearly, and after 3 years without updates the manufacturer must yield source. While this is strict, since making said sources publicly consumable adds extra work during the initial product outline, the complementing externality is that allowing your device to be ubiquitously attached to the internet en masse must carry at least some level of responsibility, just as oil producers must face that their customers pollute by propelling themselves forward and leaving something behind.

NAT〜4〜LIFE
Sep 20, 2023

I want to do something a little bit different. Never before have I mulled over a situation for so long, incurring so many rewrites/rehashes, as I have over this topic, namely IPv6's rollout and how SLAAC misses the boat on a crucial use case. I'm going to use the opportunity to state it as quickly as possible and fill in the details later on. The situation is basically that SLAAC misses how network edge operators favor address ranges which are static, singly-homed, and centrally controlled MORE than they favor that said endpoints be publicly reachable. That is it, the most succinct, terse formulation of the problem with IPv6 migration, a migration which is by no means endeared though it should be. It's something we should all be jumping on, but because SLAAC, the uncle that no one asked to come along for the migration, is here, and the options and RFCs are numerous, stagnation ensues. It's why ultimately DHCPv6+NATv6 will eventually come to the rescue, as it is the only way to address these network operators' use cases. One cannot ensure network space will never change (at the edge) for machines which are not on ARIN allocations. This couldn't be more true for last-mile ISP-allocated ranges, and if any anecdotes are to be believed, end users will routinely see their prefixes (and thus their entire network) change just as often as their IPv4 singletons, because even though ISPs could keep a prefix static, they have no financial obligation to do so. It's as if one of the cornerstones of IP networking (which has materialized behind NAT for decades) was thrown away, as if nobody needed it. But hey, there are now way more public addresses.

So here we are at the juncture: something which we've grown near and dear to has run its course, and the replacement will exceed our expectations in terms of expandability. Yet part of the way it's being offered (by default) misses the exact way in which we've come to depend on our network to operate. Oh, and what's this? No NAT needed, you say? Sounds too good to be true. Well, we'd like our addresses to stay put, thank you very much. No, multihoming is not a reasonable solution; yes, we're aware of RFC6724, and we'll be giddy someday when THAT introduces the next security event! The single greatest way to minimize unanticipated network interactions is via a SINGLE, unchanging address. Despite the work put into said RFC, which will make its way into getaddrinfo(), you can't beat an algorithm which can only choose ONE source address. It's pretty hard to fail there. In the meantime, source address selection will have no less than 8 rules which must be assessed before connection initialization. Again, it's not that these rules are inherently incapable of delivering deterministic results. Rather, from the viewpoint of an experienced developer who has seen the myriad ways in which large systems grow over time, caches are introduced, best practices are skirted, and even publicly well-known ways of initializing something are ignored, it's hard for me to trust that such moons won't align to cause interesting and totally unforeseen failure modes which wouldn't even be considered on a singly-homed system. And this isn't even taking into account the ways I've seen IPv4 multihoming act up, albeit without such logic as described in the RFC.

Now, I can already hear the retorts about what people should expect to happen when they don't read the docs, or that no one is stopping us from using DHCPv6+NATv6 for the singly-homed approach. The main problem I have is that in a sea of vendors and interaction points, some may, and indeed have, chosen to implement IPv6 SLAAC and call it done. THIS is the problem, and IT is the largest impediment hindering the move forward. The standard must contain both methods, and furthermore, if the need ever arises to allow only one, I'd opt for the one which is most functionally similar to what we already have, that being DHCPv6+NATv6.
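
For what it's worth, the singly-homed DHCPv6+NATv6 arrangement isn't exotic to set up. A minimal sketch with nftables might look like the following, assuming a stable ULA prefix (fd00:abcd::/64) handed out internally via DHCPv6 and an upstream interface named wan0; both the prefix and the interface name are illustrative:

# NAT66 sketch: masquerade internal ULA sources out the upstream interface
table ip6 nat66 {
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        oifname "wan0" ip6 saddr fd00:abcd::/64 masquerade
    }
}

The internal addressing then never changes when the ISP rotates the delegated prefix, which is the entire point.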

Pushing SLAAC as the only way forward (sans NAT) for IPv6 is, in effect, shortsighted. It is illustrative of a humanistic tendency wherein anything run by humans which becomes more widely proliferated incurs schisms when common elements are endeared by subgroups for different reasons. Things which we had no idea our neighbor loved about our commonalities cause strife when one group decides to change said elements, and so it is with technology and protocols. The severe lack of advocacy for edge network admins will only be detrimental in the end, and as it stands, even at the time of this writing, it is just those kinds of networks on which adoption lags.

Cloudy Skies
Aug 20, 2023

I recently contributed to the FTC's RFC for opinions about the "Business Practices of Cloud Computing Providers", and have also attached it here:

The most important thing I have noticed, as a regularly transitioning developer, having worked at a few places which used cloud computing (AWS) and traditional shops which go fully self-hosted, and having conducted countless interviews with companies on both sides of the fence, is that despite all the arguments which can be made about any given tool, i.e. databases, storage, functions, containers, etc., the negative effects are not immediately easy to spot. This is because they involve not solely the technical elements, but rather the human element.

If all you do is look at any given specific offering, it's arguable that it either is or isn't open. AWS elastic compute does use containerization, and in general it is portable enough to migrate elsewhere. Even software-defined networking isn't different conceptually. The way in which these services, notably AWS, create their moats is that they involve an inordinate amount of extra learning to master. This human cost is not inconsequential, as ultimately more and more generations of developers will opt to become cloud developers, or AWS/Azure/GCP developers, instead of generic developers. This has real costs, just as general database developers are almost nonexistent due to the ubiquity of database engine packages which can be harnessed by developing "in SQL"; having to create an entire bespoke engine is a practice that is virtually nonexistent, which isn't necessarily good or bad.

It is, however, illustrative of what will come to pass as cloud compute platforms fully saturate the market over time. New companies, new ideas, and innovative, bespoke industries will be shoehorned into one of a few giant rent-seeking platforms. I have seen this first-hand in the form of startup shops with very green development teams who flat-out wouldn't have the first idea how to implement their setups outside of a major cloud provider, both from the perspective of scale and of network complexity. As time progresses things will just be "done this way", as taking the time to architect an independent solution will unfortunately become a thing of the past. This puts huge advantages into the big 3's court, since they know that, like Microsoft Windows in the 90s, cloud service computing will become synonymous with web development going forward, unless companies can really assess why staying off it and self-architecting is advantageous. Most will not, and the extra knowledge previously required to keep solutions independent will become much less widespread.

The implications for security are dire as well. Security, unlike most other areas of development, is crucially linked with simplicity. One of the reasons why the web is especially fraught with perpetual insecurity lies in the incessant layering of technologies, which ultimately creates a security arbitrage, or traversal path, leading to a vulnerability. One need not look far to see how frustratingly over-complicated cloud storage is perceived to be, in the case of S3. The obvious retort is that, once learned, the automation and auto-deployable/scalable capabilities outweigh working with the simpler systems of yesteryear. This is exactly what providers are leveraging as well. They know that keeping N machines idle at traditional hosting sites is much more expensive than using an elastically deployable container system. However, it is not the case that all metrics end up being cheaper in the cloud, specifically raw storage/transit costs, the latter being another subtle example of lock-in, as traffic within the provider is greatly discounted versus outbound flows.

In the end, the extra vendor-specific learning costs will be borne by the next generation of developers, implicitly creating lock-in as fewer and fewer are able to work with pure deployments and teams become more wary of moving "off AWS". The real tragedy is that security will also suffer, as it is the part of software with the least tolerance for complexity. It's not enough for software to "just work" for it to be considered secure.

PSA Mikrotik
Feb 18, 2023

From what I've witnessed firsthand, and what is easily digestible if one were to seek out "Mikrotik" and "Power Supply Problem", Mikrotik has systemic, low-quality issues with a vast number of the power supplies which accompany its various Routerboards. I've recently had a 10-year-old RB2011 UiAS-RM's 12v supply completely fail, after a few years of needing to reboot the device due to undiagnosed crashes. More recently, an RB 750 GR3's 24v supply is suspected to have been the cause of spurious hardware issues, most specifically dropped ethernet frames after some noisy, recurring power event, after which it continues to erroneously drop traffic until rebooted. The jury is actually still out on that last one, but since the supply's replacement the device has been solid for several weeks.

This is not to say that wallwarts need to last indefinitely; rather, after having had several dozen of them over the years, with the only two failures being the ones mentioned above, it becomes obvious these supplies are less than reliable and should be considered in need of replacement if one is to buy a new Routerboard.

Bye uTorrent
Jul 07, 2021

uTorrent 2.2.1's API issue assessment

The following is the evaluation and posture assessment for a vulnerability disclosed by Tavis Ormandy and Google's Project Zero.

The Bug "recently" found/disclosed in early 2018. Could there actually exist a crude API endpoint in the client which would allow the system to be probed unattended!?

For the entire evaluation I'll assume a client version of v2.2.1, as anything later than that is garbage, for obvious reasons. Apparently in the later versions the number of endpoints seems to have been expanded, which of course allows for more potential trouble as well.

The cornerstone of this vulnerability is the realization that in an unfiltered browser (one without a script blocker or application boundary enforcer (ABE)) you can't be sure of what the delivered JS is doing. In the worst case it could compromise the integrity of the application, or at least attempt to do things unbeknownst to the user. This vector is possible even with v2.2.1.

Altering the setting "net.discoverable" to false only closes the predictable port (TCP:10000) where the API listens. This helps lessen the attack surface for web drive-bys by requiring any script to at the very least scan the port space for a service responding with "invalid request". In addition to this port, though, there is another way to the API: 0.0.0.0:PORT, which turns out to be the TCP port with the same number the client is using to orchestrate its UDP torrent protocol traffic.

This is quite unfortunate, as anyone who can see the traffic flow can attempt to access the API via this obvious pairing.

The critical aspect, however, lies in the realization that almost every practical instance of the application's use will involve a NAT router at least once somewhere upstream. This is certainly true of any use in a home setting, and almost everywhere except perhaps a data center or a rented box with a public address. What this prevents is the forwarding of SYN requests to the global listening instance on 0.0.0.0:PORT. I have actually tested this out and observed zero inbound requests over several weeks of observation and attempts to probe from either side. This means that, as long as the NAT is not altered to manually allow a port-forward, and does not honor uPnP (which should absolutely be disabled), the likelihood of traffic ever reaching it is inconsequential. The website drive-by is, however, still a valid concern and is enough to warrant the abandonment of the application.

Moreover, the purpose of such an API (especially in later versions) is unclear; it should be explained and at the very least be possible to disable wholesale, as it doesn't appear to be a valid/necessary part of the torrent protocol. If such a "local" messaging system is still needed, another approach should be used.

At least this should lend some insight into the likelihood of an unattended system takeover. Where the application is behind a NAT there is less of a concern; however, NAT should definitely not be seen as a valid form of protection. Rather, proper application design is paramount.

Lastly, I did witness uTorrent attempting to reach pub-address:PORT for the API periodically, as some sort of connectivity test. The flow would always fail to synchronize, as the NAT lacked an entry back to the true origin, as one would expect. No inbound SYNs were ever witnessed trying to reach the local application from the outside, so it does not appear to be a critical part of the torrent protocol, as normal peer activity still happened completely unabated.

So, in closing: while there are definitely concerns about the uTorrent v3.X line containing a remotely reachable API server without any access restrictions, one which can also be reached via scripts while browsing without NoScript (or equivalents), the assumptions made by the initial researcher may or may not entirely implicate what has or hasn't been a weakness for the venerable v2.2.1 line. Out of the box, none of the suspect URLs crash the older variant (per my own testing and others'), and many of the more advanced endpoints don't exist for it either. Nevertheless, the existence of an RPC endpoint sitting on the TCP port of the same number as the UDP bittorrent negotiation port is unacceptable, if for no other reason than it is an unnecessary, undocumented, proprietary interaction mechanism. Additionally, it is something that, if you were running the client on a public address, would completely open the endpoint up to anyone willing to probe it. The reality of this is, again, not straightforward, as almost everyone runs behind NAT, and at least a sensible NAT without uPnP will not have any mappings to the open TCP API endpoint on the same port number as the UDP bittorrent protocol port. This can be and has been tested by myself, in the form of trying to reach the port from outside the NAT and also watching for anything inbound on the other side (for several hours on multiple occasions). In both cases it checks out: the port isn't open on the NAT and no traffic is witnessed on the inside. This critical aspect is something absent in all discussions about the threat posture assessments for a user's existing system with this vulnerability.

Usually, upon learning of a recently disclosed or discovered vulnerability, it's extremely important for users to assess the pre-existing level of exposure they may have inadvertently been party to, regardless of the full severity of the situation. In this case, as in others, I think there's plenty of room for variation. Given that exploitation requires many elements, including network topology, browsing behavior/setup, and the like, it's not always immediately beneficial to rush to the conclusion that the worst has occurred, as the user may have been running a setup similar to what's described above, on a machine with little to no browsing activity, or where the browsing environment is well locked down via ABE. However, despite this, it's more than time to consider the application unfit and move on, as it will not be patched.