[00:28:18] <Bender> [SoylentNews] - One Dead After Poop Transplant Gone Wrong, FDA Warns - http://sylnt.us - sanitation
[01:55:56] <Bender> [SoylentNews] - Crop Breeding Sped Up Using LED Growth Chambers - http://sylnt.us - just-like-chickens
[03:25:33] <Bender> [SoylentNews] - Critical Bug in Infusion System Allows Changing Drug Dose in Medical Pumps - http://sylnt.us - infused-with-bugs
[05:06:50] <Bender> [SoylentNews] - Some YubiKey FIPS Keys Allow Attackers to Reconstruct Private Keys - http://sylnt.us - not-so-compliant
[06:37:59] <Bender> [SoylentNews] - VR and Microscopy Help Scientists See 'Inside' Diseases - http://sylnt.us - X-rauy-glasses
[08:06:09] <Bender> [SoylentNews] - Hackers Behind Dangerous Oil and Gas Intrusions are Probing US Power Grids - http://sylnt.us - playing-with-fire
[09:36:47] <Bender> [SoylentNews] - Netflix Puts a 'Patriot Act' Episode About Bad Internet Access on DVD - http://sylnt.us - delicious-irony
[11:06:24] <Bender> [SoylentNews] - Porn Trolling Mastermind Paul Hansmeier Gets 14 Years in Prison - http://sylnt.us - prenda-time
[12:47:41] <Bender> [SoylentNews] - Domino's Will Start Robot Pizza Deliveries in Houston This Year - http://sylnt.us - you-got-30-minutes
[13:09:41] <FatPhil> https://petapixel.com
[13:09:41] <upstart> ^ NASA Finds 'Star Trek' Starfleet Logo on Mars
[13:56:33] <Bender> [SoylentNews] - Volvo Testing Automated Electric Truck in Sweden - http://sylnt.us - is-this-an-ad?
[14:54:50] <FatPhil> woo woo! beers are on me!!!!!
[14:55:43] <FatPhil> Fucking useless shitshow of a 'bank' Santander have grudgingly admitted that they fucked up and had no right to charge me #133 in fees.
[15:18:14] <SemperOSS> FatPhil: After moving to the UK, I find that banks here cause more difficulties than banks where I come from. Part of it is statutory bureaucracy, part of it is tradition and part of it is British stuck-up-ness. I don't know how banks handle things in Spain, but I sure feel that Santander has completely gone native with their business in the UK.
[15:22:58] <FatPhil> Their bureaucracy ain't that good - it was my g/f who not only discovered the issue, but got it fixed - they never asked her to identify herself!
[15:26:13] <Bender> [SoylentNews] - Ajit Pai Says NOAA and NASA Are Wrong About 5G Harming Weather Forecasts - http://sylnt.us - two-minutes-hate
[17:05:57] <Bender> [SoylentNews] - Facebook Announces Libra Cryptocurrency - http://sylnt.us - capricorn-or-taurus-but-not-libra
[18:26:59] <Bender> [SoylentNews] - June 2019 TOP500 List: All 500 Systems Above 1 Petaflops - http://sylnt.us - MOAR-POWER-[caveman-grunt.wav]
[20:08:16] <Bender> [SoylentNews] - 'Puppy Dog Eyes' May Have Evolved Just to Make Humans Melt - And It's Working - http://sylnt.us - can-I-have-a-treat-pleeeeease?
[21:20:24] <chromas> I thought search in vi(m) was / but it doesn't work like less
[21:23:27] <SemperOSS> chromas: Hey, what, wait! Search is / in vi(m)
[21:46:03] <Bender> [SoylentNews] - Google Pledges to Build 15,000+ Homes in San Francisco - http://sylnt.us - no-appreciation
[21:48:50] <FatPhil> if search isn't /, then vim's broken vi
[21:51:21] * FatPhil wasted a whole day trying to turn utf-8 beer and brewery names into ASCII equivalents for easy 7-bit grepping.
[21:51:58] <FatPhil> (must run on a machine that has very few libraries and tools, i.e. had to roll my own)
[21:56:03] <FatPhil> damn, I just noticed, that it did already have iconv on it, that should have worked.
[21:56:59] <FatPhil> Hmmm, but doesn't, so not a wasted day
[21:57:04] <FatPhil> iconv: illegal input sequence at position 546
[21:57:33] <FatPhil> what's illegal about this: 0001040 e n 342 200 231 t w e l l i n t e
[21:59:16] -!- AzumaHazuki [AzumaHazuki!~hazuki@loj-076-051-71-829.neo.res.rr.com] has joined #Soylent
[21:59:16] -!- AzumaHazuki has quit [Changing host]
[21:59:16] -!- AzumaHazuki [AzumaHazuki!~hazuki@the.end.of.time] has joined #Soylent
[22:06:39] <SemperOSS> FatPhil: Are you using the TRANSLIT option? Like "iconv -f UTF-8 -t ASCII//TRANSLIT ..."
[22:09:39] <SemperOSS> echo 'â™t well inte' | iconv -t ASCII//TRANSLIT produces the following output on my machine: a?t well inte
[22:10:23] <FatPhil> Ah, it all makes sense - it was written by Uli "your system's carp!" Drepper
[22:10:23] <SemperOSS> s/LIT/LIT"/
[22:10:27] <exec> <SemperOSS> echo 'â™t well inte' | iconv -t ASCII//TRANSLIT" produces the following output on my machine: a?t well inte
[22:10:55] <FatPhil> I wasn't using //TRANSLIT (I've never used iconv before, that's why I didn't think of using it)
[22:11:05] <FatPhil> It shuts up when I use that switch.
[22:11:31] <SemperOSS> Sometime you only need a little nudge to get on with a job, eh?
[22:11:46] <SemperOSS> s/me/mes/
[22:11:48] <exec> <SemperOSS> Somestime you only need a little nudge to get on with a job, eh?
[22:12:07] <FatPhil> So when there's a problem with the program's ability to *output*, the error message complains unambiguously about the *input*.
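A small sketch of the point FatPhil makes here, using Python only to inspect the bytes (the variable names are illustrative, not from the log). The octal values 342 200 231 in the dump above are taken as given; the sketch suggests the input is well-formed UTF-8 and that the trouble lies in representing it in ASCII.

```python
# Octal bytes 342 200 231 from the od dump, i.e. 0xE2 0x80 0x99.
raw = bytes([0o342, 0o200, 0o231])

# The bytes decode cleanly as UTF-8, so the input itself is well formed.
ch = raw.decode("utf-8")
print(f"U+{ord(ch):04X}")  # U+2019, RIGHT SINGLE QUOTATION MARK

# The failure is on the output side: ASCII has no such character.
try:
    ch.encode("ascii")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)  # False
```

In other words, without a transliteration rule there is nothing for a strict UTF-8 to ASCII conversion to emit for this character.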
[22:12:22] <FatPhil> Thanks, Uli, you're the best.
[22:12:24] <SemperOSS> I can't get it right. Well beyond my bedtime is my excuse
[22:12:41] <FatPhil> Ditto here. My brain has been slowly dissolving all day
[22:13:53] <FatPhil> Fortunately, it wasn't wasted effort - //TRANSLIT is useless for the task:
[22:13:54] <FatPhil> - name: Hummel-Br?u R?ucherla M?rzen
[22:13:55] <FatPhil> + name: Hummel-Brau Raucherla Marzen
[22:13:55] <SemperOSS> As so many other programs, it has its little quirks
[22:14:22] <FatPhil> ? ain't useful to man nor beast.
[22:14:46] <SemperOSS> Let me just check that ...
[22:18:03] <SemperOSS> It works on my machine. Do you have your system character set as UTF-8? If not use iconv -f UTF-8 -t ASCII//TRANSLIT
[22:18:37] <SemperOSS> Actually, just use iconv -f UTF-8 -t ASCII//TRANSLIT
[22:19:18] <FatPhil> that's the invocation I'm using. But I get shit like this: name: U T?? R??? Kl??tern? Speci?l Sv. Jilj? No.4
[22:21:39] <SemperOSS> Strange
[22:22:05] <SemperOSS> I have literally just tried it on my system without any problems.
[22:22:13] <FatPhil> Uli's outdone himself this time!
[22:22:28] <FatPhil> help is --help, but -? rather than -h
[22:22:31] <FatPhil> which means:
[22:22:33] <FatPhil> phil@bazspaz:rb$ touch './-x'
[22:22:33] <FatPhil> phil@bazspaz:rb$ iconv -?
[22:22:33] <FatPhil> iconv: invalid option -- 'x'
[22:22:35] <FatPhil> Try `iconv --help' or `iconv --usage' for more information.
[22:23:13] <SemperOSS> echo '>à á â á à ä æ<' | iconv -f UTF-8 -t ASCII//TRANSLIT
[22:23:13] <SemperOSS> >a a a a a a ae<
[22:24:51] <FatPhil> my iconv is ancient
[22:25:20] <FatPhil> 2.13 from 2011
[22:25:51] <SemperOSS> Mine is the latest Debian. That might be the problem
[22:26:20] <SemperOSS> Version 2.24 from 2016
[22:27:26] <FatPhil> c'est la vie - this machine no longer has a repo to update from
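Where iconv cannot transliterate, one alternative is to decompose accented letters and drop the combining marks. This is a different technique from iconv's //TRANSLIT and not something either participant used; the sketch below assumes a Python interpreter is available, and `strip_marks` is a hypothetical helper name.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# "Hummel-Bräu Räucherla Märzen", written with escapes to stay 7-bit clean.
name = "Hummel-Br\u00e4u R\u00e4ucherla M\u00e4rzen"
print(strip_marks(name))  # Hummel-Brau Raucherla Marzen
```

Letters with no canonical decomposition, such as æ, pass through unchanged, so a hand-written table would still be needed for those.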
[22:28:34] <FatPhil> I'm planning to trash it RSN, before it self-implodes, but I can't get its replacement to boot yet (DHCP is too slow, and it does weird fallback shit, and can't be told to wait longer)
[22:29:48] <FatPhil> There's a reason FOSS has this attached:
[22:29:50] <FatPhil> This is free software; see the source for copying conditions. There is NO
[22:29:50] <FatPhil> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[22:31:46] <SemperOSS> Well, that's what it is. I tried something else, which did not prove useful either: "echo '>a ä å<' | tr 'aäå' 'aaa'" which gives ">a aa aa<"
[22:33:00] <SemperOSS> echo '>a ä å<' gives >a ä å<
[22:33:25] <SemperOSS> Ergo, tr has some UTF-8 quirks
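A possible explanation for the doubled letters, sketched in Python: if tr works on bytes rather than characters, each two-byte UTF-8 letter is translated as two separate bytes. The sketch assumes the shorter second set is padded with its last character, as GNU tr does; it emulates the behaviour and does not call tr itself.

```python
# ">a ä å<", written with escapes so the source stays 7-bit clean.
text = ">a \u00e4 \u00e5<"

# Byte-wise view: the set "aäå" is five bytes (61 C3 A4 C3 A5).
set1 = "a\u00e4\u00e5".encode("utf-8")
set2 = b"a" * len(set1)  # "aaa" padded out with its last character
bytewise = text.encode("utf-8").translate(bytes.maketrans(set1, set2))
print(bytewise.decode("ascii"))  # >a aa aa<

# Character-wise view: each letter maps to exactly one replacement.
charwise = text.translate(str.maketrans("a\u00e4\u00e5", "aaa"))
print(charwise)  # >a a a<
```

The byte-wise result matches the ">a aa aa<" reported above, which is consistent with tr not being UTF-8 aware on that system.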
[22:34:21] <SemperOSS> Do you have a compiler for that machine? You must have as you cooked a version yourself, you said ... or do I misremember?
[22:35:31] <SemperOSS> This could be solved with a *very* small C program
[22:37:05] <FatPhil> I can't see your 8-bit bits in this ssh/tmux/irssi combo
[22:37:52] <FatPhil> However, tr definitely wasn't utf-8 clean historically.
[22:38:13] <SemperOSS> No and apparently not quite, still.
[22:39:54] <FatPhil> I do have C, but I'm scripting in perl and lua presently.
[22:39:57] <SemperOSS> If you send me a spec of from to in some nice format, I could have a C program fixed by tomorrow, some time ... needs a C compiler on your part, though
[22:40:42] <SemperOSS> Perl is born to do this if you make input and output UTF-8
[22:41:01] <FatPhil> At the moment, I'm treating each new character individually as I encounter it, it's not obvious what I will want each non-ASCII thing to map to.
[22:41:38] <SemperOSS> No, but when you have to, use perl to do it
[22:41:55] <FatPhil> The LATIN stuff's pretty easy, but some bellends think that putting unicode-art in their beer names is clever.
[22:43:07] <FatPhil> The script I wrote today doesn't even bother with utf-8, my mappings are shit like this:
[22:43:10] <FatPhil> '\xC4\x99' => 'e', # LATIN SMALL LETTER E WITH OGONEK
[22:44:33] <FatPhil> i have more colourful comments too: '\xE2\x84\x96' => 'No.',# in a twatty font - braindead
[22:44:38] <SemperOSS> Why don't you have the UTF encoding directly in the perl?
[22:46:33] <FatPhil> Firstly because the mapping was ripped out of code I was using nearly 10 years ago, when perl wasn't so utf-8 friendly, and secondly as I don't see the need to change. All the tables I'm referring to have the explicit escaped utf-8 in them.
[22:48:01] <FatPhil> That means I don't need to interpose a utf-8 decoding layer. it might be all-but-invisible, but that doesn't mean it's free.
[22:48:05] <SemperOSS> That doesn't make your life easier, but how do you know what to convert and what not, then? I presume that's part of the already established logic, or ...?
[22:48:33] <FatPhil> I only need to convert characters found in beer names that I've drunk.
[22:48:50] <FatPhil> because I like drinking beer names!
[22:49:10] <SemperOSS> And you are using "binmode(STDOUT, ":utf8"); binmode(STDERR, ":utf8");" to make the output UTF-8 for real?
[22:49:46] <SemperOSS> Yeah, I always eat the label and pour the rest into the sink ;-)
[22:50:21] <FatPhil> Output is ASCII, for use on lowest-common-denominator devices.
[22:50:42] <FatPhil> (typically busybox on my phone)
[22:55:32] <SemperOSS> I'm trying to get a complete hold on the problem here. Does "s/\xC3\xA2/a/g" not work?
[22:57:09] <SemperOSS> If that is the case, maybe your perl thinks the data is UTF-8 and represents it internally as such, which means you have to use "use utf8;" and specify the conversions as UTF-8 strings.
[22:57:56] <FatPhil> Perfectly - but I needed to make sure \xC3\xA2 => 'a' is in my %utf8_replacements: s/($utf8_regexp)/$utf8_replacements{$1}/go;
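The table-driven substitution described here can be sketched in Python as an analogue of the Perl one-liner; it is not FatPhil's script. The three mappings are the ones quoted in the conversation, while `replacements`, `pattern`, `to_ascii` and the sample string are hypothetical.

```python
import re

# Escaped UTF-8 byte sequences mapped to ASCII, as in the quoted table.
replacements = {
    b"\xC4\x99": b"e",        # LATIN SMALL LETTER E WITH OGONEK
    b"\xE2\x84\x96": b"No.",  # NUMERO SIGN
    b"\xC3\xA2": b"a",        # LATIN SMALL LETTER A WITH CIRCUMFLEX
}

# One alternation over all keys, longest first, built once and reused.
pattern = re.compile(
    b"|".join(re.escape(k) for k in sorted(replacements, key=len, reverse=True))
)


def to_ascii(data: bytes) -> bytes:
    """Replace every known byte sequence; leave everything else alone."""
    return pattern.sub(lambda m: replacements[m.group(0)], data)


# Made-up input containing all three mapped characters.
sample = "Speci\u00e2l \u2116 4 z \u0119".encode("utf-8")
print(to_ascii(sample))  # b'Special No. 4 z e'
```

Working on raw bytes means no UTF-8 decoding layer is involved, which is the trade-off FatPhil describes: unknown characters simply pass through until a mapping is added for them.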
[22:59:51] <SemperOSS> Argh. You could write a perl script to do that :-)
[23:00:34] <FatPhil> For the LATIN ones, it would have been very tempting, but I had 90% of those already handled in my 10-year old script
[23:01:02] <FatPhil> This task was easy compared with the task 10 years ago.
[23:04:34] <SemperOSS> I'm afraid it is too late for me now
[23:04:38] <FatPhil> 10 years ago, the input data contained some \xe6 latin 1, some \\xe6 escaped latin one, some &#lt; HTML entities, some &#083; numeric HTML entities, some &#40 broken HTML entities.
[23:05:00] * SemperOSS needs his beauty sleep ... Not that it helps, but ...
[23:05:38] <FatPhil> ditto, alas I'm in pain, so might need to self-medicate...
[23:06:52] <SemperOSS> I'm afraid I have to let you work on it yourself for now, but I'll probably be back tomorrow if you need something (my wooden head) to bang ideas against
[23:07:27] <SemperOSS> I hope you'll manage your pain and have a nice night after all.
[23:07:35] <FatPhil> All my beer reviews are handled now, so I'm happy.
[23:08:07] <SemperOSS> And don't have silly dreams of HTML-escaped Unicode characters written in Hex-ASCII-UTF-8
[23:08:18] <FatPhil> I'm also mirroring some friends' beer reviews, no doubt there will be some new dodgy characters in theirs.
[23:08:28] <FatPhil> IN XML!!!!
[23:08:44] <SemperOSS> JSON!
[23:09:36] <SemperOSS> And it should really have been Octal-Hex-ASCII-UTF-8
[23:09:56] <SemperOSS> I'm off now. Laters!
[23:12:17] <FatPhil> n.d. https://www.ratebeer.com which will probably throw an 8-bit spanner in the works tomorrow!
[23:12:17] <upstart> ^ RateBeer
[23:27:20] <Bender> [SoylentNews] - World's Largest Plant Survey Reveals Alarming Extinction Rate - http://sylnt.us - seeds-of-change?
[23:38:22] <Bytram> FatPhil: I'm late to the party, but IIRC, there is a "thing" in Unicode circles called "confusables" which has mappings of characters that could be (very loosely, even) confused with another character. Consider diacritical marks like accent grave, acute, diaeresis, etc. ISTM that a subset of that would do the trick for you.
[23:42:39] <Bytram> First place I've found on-line: http://www.unicode.org
[23:42:40] <upstart> ^ UTS #39: Unicode Security Mechanisms ( http://www.unicode.org )
[23:42:51] <Bytram> Still looking for the exact file I was thinking of
[23:44:29] -!- AzumaHazuki has quit [Remote host closed the connection]
[23:45:30] -!- AzumaHazuki [AzumaHazuki!~hazuki@loj-076-051-71-829.neo.res.rr.com] has joined #Soylent
[23:45:30] -!- AzumaHazuki has quit [Changing host]
[23:45:30] -!- AzumaHazuki [AzumaHazuki!~hazuki@the.end.of.time] has joined #Soylent
[23:48:56] <Bytram> FatPhil: Here's the place I was looking for: http://www.unicode.org and, specifically, this file: http://www.unicode.org
[23:48:56] <upstart> ^ Index of /Public/security/latest
[23:57:05] <Bytram> scroll down to about line 1700 and you will see a list of all different kinds of things that could potentially be confused with: "LATIN CAPITAL LETTER A"
[23:57:48] <FatPhil> Bytram: there's some good stuff in that file - the apostrophes/quotes and dashes have been a bit of a bugbear today.
[23:57:57] <Bytram> =)
[23:59:57] <Bytram> I found it to be quite, please forgive me, "entertaining" as to all the various ways one could attempt to pass off one string of characters as being another. The classic being "paypal" but using Cyrillic (IIRC) letter "a" instead of latin letter "a".
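The "paypal" trick mentioned here can be illustrated with a short Python sketch. It assumes the spoof swaps in U+0430 (Cyrillic small letter a) for the first "a"; the variable names are illustrative.

```python
import unicodedata

genuine = "paypal"
spoof = "p\u0430ypal"  # second letter is U+0430, not U+0061

# The strings can render identically but do not compare equal.
print(genuine == spoof)  # False

# List the positions where the two strings differ, by character name.
diffs = [
    (unicodedata.name(g), unicodedata.name(s))
    for g, s in zip(genuine, spoof)
    if g != s
]
print(diffs)  # [('LATIN SMALL LETTER A', 'CYRILLIC SMALL LETTER A')]
```

A mapping table like the confusables data would let a script fold such look-alikes back to a single ASCII letter before comparing or grepping.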