• 0 Posts
  • 13 Comments
Joined 2 years ago
Cake day: June 16th, 2023


  • All of my analysis comes from the US Bureau of Transportation Statistics, particularly information published in 2024. You are right: it is very, very difficult to normalize the data across different modes. My analysis is specifically over passenger-miles, but I also did it over passenger-hour exposure, which is significantly worse because you can cover more distance in a shorter amount of time.

    I wrote a section on the subject in a paper that is currently under academic review at CHI, but I ultimately cut the section. I should write it into a blog at some point.

    Risk assessment is tricky business, and I’ve spent countless hours in discussion with colleagues over the topic. Humans, even highly attuned academics, are inherently terrible at assessing risk for low-frequency events. While we like to say things like one is 3x more likely than the other, such statements often lack the context of scale. A lack of context often results in overly cautious recommendations that encourage people to live like a bubble-boy. I’ve advocated in the past that all academic journals should adopt a common risk metric, like the micromort, when reporting on risk.

    There are 800,000 pilots in the US and an average of 300 deaths per year, or 3.75 per 10,000 pilots. There are 243 million drivers in the US and 40,000 deaths per year, or about 1.6 per 10,000 drivers. While one is higher than the other, they are still incredibly low-frequency events, and our ape brains are not capable of adequately reasoning over that concept.
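
    The arithmetic above can be sketched roughly as follows. This is only an illustration using the round figures quoted in this comment; the function names are made up for the example, and the micromort conversion simply uses the standard definition of one micromort as a one-in-a-million chance of death.

```python
def deaths_per_10k(deaths_per_year, population):
    """Annual deaths per 10,000 people in the group."""
    return deaths_per_year / population * 10_000


def micromorts_per_year(deaths_per_year, population):
    """Annual risk in micromorts (1 micromort = 1-in-1,000,000 chance of death)."""
    return deaths_per_year / population * 1_000_000


# Round figures quoted in the comment: (deaths per year, group size).
groups = {
    "pilots": (300, 800_000),
    "drivers": (40_000, 243_000_000),
}

for label, (deaths, population) in groups.items():
    rate = deaths_per_10k(deaths, population)
    mm = micromorts_per_year(deaths, population)
    print(f"{label}: {rate:.2f} per 10,000 per year, {mm:.0f} micromorts per year")
```

    With those inputs, pilots come out at 375 micromorts per year and drivers at roughly 165. These are per-person annual figures, not normalized by miles or hours of exposure.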


  • Riding a motorcycle or flying GA is only 3x more lethal than driving and just as lethal as walking (75% of walking fatalities occur at night and are entirely due to cars).

    The big difference is that on a motorcycle the danger is other drivers, whereas for a pilot the danger comes from themselves, as 90% of fatal accidents are caused by pilot error.

    Eliminating pilot error is entirely possible, as demonstrated by the commercial airline industry, which ends up being the safest form of travel by multiple orders of magnitude over cars. If GA pilots can hold themselves to the same rigorous standards as commercial airline transport pilots, then GA can absolutely be much safer than driving.


  • The tokenizer is capable of decoding spaceless tokens into compound words following a set of rules referred to as a grammar in Natural Language Processing (NLP). I do LLM research and have spent an uncomfortable amount of time staring at the encoded outputs of most tokenizers when debugging. Normally spaces are not included.

    There is of course a token for spaces in special circumstances, but I don’t know exactly how each tokenizer implements those spaces. So it does make sense that some models would be capable of the behavior you find in your tests, but that appears to be an emergent behavior, and it is very interesting to see it work successfully.

    I intended for my original comment to convey the idea that it’s not surprising that LLMs might fail at following the instructions to include spaces, since they normally don’t see spaces except in special circumstances. Similarly, it’s unsurprising that LLMs are bad at numerical operations because of how they use Markov chain probability to pick each next token, one at a time.


  • This is because spaces are typically encoded away by model tokenizers.

    In many cases it would be redundant to show spaces, so tokenizers collapse them down to no spaces at all. Instead the model reads tokens as if the spaces never existed.

    For example it might output: thequickbrownfoxjumpsoverthelazydog

    Except it would actually be a list of numbers like: [1, 256, 6273, 7836, 1922, 2244, 3245, 256, 6734, 1176, 2]

    Then the tokenizer decodes this and adds the spaces because they are assumed to be there. The tokenizer has no knowledge of your request, and the model output typically does not include spaces, hence your output sentence will not have double spaces.
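
    As a toy illustration of the round trip described above, here is a minimal word-level sketch. It is not how any real tokenizer is implemented (real ones differ in how they treat spaces); the vocabulary is hypothetical, and the ids are just the illustrative numbers from this comment, with 1 and 2 assumed to be start and end markers.

```python
# Hypothetical vocabulary; the ids are the illustrative numbers used above.
VOCAB = {
    "<s>": 1, "</s>": 2,  # assumed start/end markers
    "the": 256, "quick": 6273, "brown": 7836, "fox": 1922,
    "jumps": 2244, "over": 3245, "lazy": 6734, "dog": 1176,
}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}
SPECIAL_IDS = {VOCAB["<s>"], VOCAB["</s>"]}


def encode(text):
    # split() with no arguments drops every run of whitespace,
    # so one space and two spaces produce the same ids.
    words = text.lower().split()
    return [VOCAB["<s>"]] + [VOCAB[w] for w in words] + [VOCAB["</s>"]]


def decode(ids):
    # The decoder re-inserts exactly one space between words;
    # it has no way of knowing how many were in the original text.
    words = [ID_TO_TOKEN[i] for i in ids if i not in SPECIAL_IDS]
    return " ".join(words)


single = encode("the quick brown fox jumps over the lazy dog")
double = encode("the  quick  brown  fox  jumps  over  the  lazy  dog")

print(single)
print(single == double)
print(decode(double))
```

    In this toy version the single-spaced and double-spaced inputs encode to the same list of ids, so the decoded sentence comes back with single spaces either way.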



  • Old cars are work for sure, but if you are willing to learn, it’s not bad.

    I have a 2007 Mustang. I’ve replaced the entire front suspension, rear differential, and alternator, and paid an upholsterer to replace the convertible top. I upgraded the radio and put in a 10-inch touch screen with wireless CarPlay and an integrated backup camera. Next up is dropping the trans to replace the clutch plate and throw-out bearing, resurface the flywheel, and replace the rear main seal on the engine while I’m down there. The flywheel is rusty and accumulates a thin layer of rust every morning, which makes a grinding noise for 30 seconds until it grinds off.

    It definitely doesn’t just work like a new car, but since I do the work myself it also doesn’t cost me much.





  • An LLC is a business. There’s no way around it. The IRS will revoke your LLC if you are not running it as a business or under protected non-profit clauses.

    Don’t take my word for it. Please consult with someone who has owned LLCs or even sole proprietorships for more than 5 years before charging ahead.

    I’ve been running either an LLC or a sole proprietorship for 7 years, but I’m just a random internet person.

    Also, only 1/3 of tax law is the actual words of any given law. The other 2/3 is executive interpretation/enforcement and case law from around the country.

    There are some really interesting cases, even ones where tax law firms get it wrong. In one instance a law firm tried to deduct their daily lunches as business meetings, and the tax court said no, even though the text of the law clearly states that this is permissible. The judge basically said you can’t declare a daily lunch as a business meeting.

    Other court documents can be found here:

    https://www.taxnotes.com/research/federal/court-documents