I created a 1.1G file with dictionary words and timed...
% du -h a
1.1G a
% time look 'dog' a | wc -l
53856
real 0m0.021s
user 0m0.020s
sys 0m0.003s
% time grep '^dog' a | wc -l
53856
real 0m28.593s
user 0m0.977s
sys 0m2.223s
OK, grep performed worse than I expected (though sorting the file so look could use it took a long time).
That seems abysmally slow for grep. On my machine, with a 1.1G file made of 1200 copies of /usr/share/dict/words, GNU grep 2.16 takes about 1.5 seconds. Have you tried with LC_ALL=C?
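If you want to try the locale effect on a small scale, here's a sketch (the file name and size are made up for the example; the original test used a 1.1G file):

```shell
# Build a small test file of prefixed lines (a stand-in for the big dict file).
seq 1000 | sed 's/^/dog/' > /tmp/words.txt

# Same pattern, two locales. Under a UTF-8 locale grep may have to handle
# multibyte characters; LC_ALL=C makes the match purely byte-oriented,
# which is often dramatically faster on large files.
time grep -c '^dog' /tmp/words.txt
time LC_ALL=C grep -c '^dog' /tmp/words.txt
```

Both commands should report the same count; only the elapsed time differs, and the gap grows with file size.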
Yeah, there's a lot of that, but using grep here is like using duct tape to hold something together when you really should be using screws. grep can do everything, but it's not always the best tool for the job.
The "ack" you reference is apparently faster only in that it makes it easier to specify exactly which files to search or not to search -- which can obviously also be done with grep, e.g. combined with 'find'.
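A sketch of that find + grep combination (the directory and file names are invented for the example):

```shell
# Restrict the search to C sources only, the way ack filters by file type.
mkdir -p /tmp/findgrep && cd /tmp/findgrep
printf 'int main(void) { return 0; }\n' > a.c
printf 'main street\n' > notes.txt

# find decides which files are eligible; grep -l prints just the names
# of the files that actually match.
find . -name '*.c' -exec grep -l 'main' {} +
```

Only a.c is searched, so notes.txt never shows up even though it contains "main".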
As for grep itself, it has had many incarnations over the decades with various pluses and minuses.
The current GNU grep allows 3 kinds of regex, basic/extended/perl -- the latter being what ack supports.
Note that Perl regexes have extensions that go beyond regular languages, and those extensions are inherently slower to match than the finite automata that basic regexes compile to. Power versus speed.
E.g. grep(1): "Back-references are very slow, and may require exponential time."
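A quick illustration with a POSIX BRE back-reference (tiny made-up input):

```shell
# \(dog\)\1 matches a line where "dog" is immediately repeated. A pattern
# with a back-reference can't be compiled into a plain finite automaton,
# so grep must fall back to a slower backtracking search for it.
printf 'dogdog\ndogcat\n' | grep '\(dog\)\1'
```

On this two-line input only "dogdog" is printed; on a large file the backtracking cost is what the man page is warning about.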
For further info:
> why GNU grep is fast
> Mike Haertel mike at ducky.net
> Sat Aug 21 03:00:30 UTC 2010
> Here's a blog post from 2006 about a developer trying to "beat grep" and looking at the algorithms it uses; it goes into a little more detail about the "doesn't need to do the loop exit test at every step" optimization mentioned in this email.
The best writeup is surely by the inimitable Russ Cox, who explains very clearly why, as of 2007, grep was one of the only fast regex implementations:
Regular Expression Matching Can Be Simple And Fast [#1]
(but is slow in Java, Perl, PHP, Python, Ruby, ...)
(This is a 4-part series, but IIRC part 1 has the highlights)
I'm sure that various other tools have been strongly influenced by this famous essay, and so many more things may be as fast as grep by now, but still...
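Cox's headline example is easy to try yourself: the pattern a?…a?a…a (n of each) matched against a string of n a's takes exponential time in a backtracking engine, while grep's automaton-based matcher handles it instantly. A sketch with n = 25:

```shell
# Build the pattern a?a?...a?aa...a (25 of each) and the input aaaa...a
# (25 a's). printf reuses its format string once per argument, and %.0s
# consumes the argument while printing nothing.
n=25
pat=$(printf 'a?%.0s' $(seq $n); printf 'a%.0s' $(seq $n))
input=$(printf 'a%.0s' $(seq $n))

# grep -E compiles this to an automaton and matches in linear time; a
# backtracking engine can take on the order of 2^25 steps on this input.
echo "$input" | grep -cE "$pat"
```

grep reports 1 matching line essentially instantly; try the same pattern in a backtracking engine and the runtime blows up as n grows.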
P.S. one of the other high-profile "ack"-like search tools would be "ag", aka "The Silver Searcher".
"The Silver Searcher is a 3-5x faster drop-in replacement for ack (which itself is better than grep)."
Seems that most of grep's time was spent waiting on disk. This underlines the reason look may be significantly faster: grep has to read (and discard) every non-matching line in the 1.1G file, while look's binary search on the sorted file only touches a handful of blocks.
Minimalism is underrated imho. Just because something is simple doesn't mean it sucks. Most sites are so full of crap they take minutes to fully load. My point is that you can build a site that covers 90% of your needs in 5 minutes. Whether typed.pw looks good or bad is a question of taste. I like it; the most important aspect for me is readability anyway. What I absolutely dislike is sites that are too "modern": they mess with the scrollbar, use lightboxes/popups, use the same color for normal text and links, make text fields almost indistinguishable from the rest of the page, etc.
My main point is that we fail to use the tools we have properly, and instead build something more powerful than needed. And sometimes this is damaging too (e.g. sites taking minutes to load, etc.).
BTW not having features sometimes IS a feature. YMMV.
I'm running Debian with KDE. The GNU tools are installed but by weight they're nowhere near the majority of the code on the system, to say nothing of the components I actually use. So calling that machine "GNU/Linux" instead of "KDE/X.org/Linux", given the relative importance of the individual components, would be stupid.
Fortunately, we have the nearly universally understood "Linux" instead, and weird specializations like Android can be called something else.
I agree that "naming all the things" is a bit silly. But I think GNU/Linux indicates Linux kernel, GNU libc, compiled with gcc -- which is actually kind of useful information. It doesn't say anything about the Graphical UI (if any). Similarly I think Android/Linux is useful, because it indicates something about what kind of (binary) software you can expect will work - and what will not.
I also happen to think distinctions such as Debian/kFreeBSD (a Debian distribution based around the FreeBSD kernel, as opposed to (just) the regular FreeBSD user-land) are informative.
I'm not sure what one should/would call a Linux distribution with an alternative (non-GNU) libc that relied on a non-GNU compiler chain... Probably "brandname"/Linux or "function"/Linux (eg: Linux Router Project or something)...
This isn't a technical thing for any of the major proponents, though. It's a marketing thing, because after the spectacular failure of Hurd GNU was relegated to effectively a sideline (a useful one, for sure, but a sideline). I get a lot more direct and personal value out of code not provided by GNU than I do by code provided by GNU. I don't think they merit top billing, and I don't think the two decades of holding one's breath and turning purple deserves a reward.
Look, the Linux kernel itself uses the GNU GPL, so GNU probably is the most important factor in all of this. Had the Linux kernel been under a proprietary license, or even a BSD-style one, you probably would never have heard of it. At any rate, basically nobody thinks HURD is useful anymore; RMS himself considers time spent on HURD a waste now that we have the Linux kernel. GNU is a political movement about software freedom as much as, or more than, a particular set of software, and even KDE is licensed under a GNU license. Etc.
"Look," the creator of Linux could not give the faintest fart about GNU and has said he'd use a different license if he could. Why should a political movement that specializes in lousy marketing campaigns and haranguing be given a nod?
I do like that attempt, though. "Well, they used a license that GNU wrote, so GNU should get top billing!" Do you want to talk about how well Apache/Dropwizard and PostgreSQL/PostgreSQL work with Apache/Kotlin on GNU/OpenJDK8?
> has said he'd use a different license if he could
Citation?
A constantly repeated quote from Linus is precisely: "Making Linux GPL'd was definitely the best thing I ever did." from a 1997 interview according to https://en.wikiquote.org/wiki/Linus_Torvalds
I've never heard Torvalds say anything regretful about using the GPL. When I heard him in person just last year, he clarified his dislike of GPLv3 while emphasizing his preference for, and liking of, GPLv2.
And anyway, I didn't say anything about "top billing". The fact is, Linux itself is, admittedly, absurd billing for Linus given that he is the leader of the kernel project but is among thousands of people who make it happen. GNU is not a term that credits Richard Stallman. GNU is a community project with a particular political aim. And my point all along was just about some practical way to differentiate Android from the other primary Linux-based systems, and "GNU/Linux" is a way to do that.
It's the old chicken-and-egg problem: you build a community by attracting new users, but new users will join only if there's already a good community. It's hard to get a community started.
Strong moderation with clearly defined goals and rules could go a long way, provided you had the resources to back it up long enough for it to become self-sufficient. Stack Exchange has some categories with strong moderation and rules. The problem with SE, in my opinion, is that posting there feels like working without being paid. There's almost no fun in posting answers on SE, while there is still a bit of fun in posting on HN.
No, because you didn't create those other files, your enumerated bits won't have the right color ;)
"Bits don't have Colour; computer scientists, like computers, are Colour-blind. That is not a mistake or deficiency on our part: rather, we have worked hard to become so. Colour-blindness on the part of computer scientists helps us understand the fact that computers are also Colour-blind, and we need to be intimately familiar with that fact in order to do our jobs.
The trouble is, human beings are not in general Colour-blind. The law is not Colour-blind. It makes a difference not only what bits you have, but where they came from. [...] The law sees Colour.
Suppose you publish an article that happens to contain a sentence identical to one from this article, like "The law sees Colour." That's just four words, all of them common, and it might well occur by random chance. Maybe you were thinking about similar ideas to mine and happened to put the words together in a similar way. If so, fine. But maybe you wrote "your" article by cutting and pasting from "mine" - in that case, the words have the Colour that obligates you to follow quotation procedures and worry about "derivative work" status under copyright law and so on. Exactly the same words - represented on a computer by the same bits - can vary in Colour and have differing consequences. When you use those words without quotation marks, either you're an author or a plagiarist depending on where you got them, even though they are the same words. It matters where the bits came from." - from http://ansuz.sooke.bc.ca/entry/23
Basically, non-recorded metadata about how a sequence of bits was created matters too. Not just the bits themselves.
Now you've simultaneously copyright infringed all the worlds works by sharing them on the internet, AND generated all other works, making it impossible for anyone else to ever not infringe on your work.
I remember someone created a P2P file sharing system that used 'munges' of files to create blocks of data by themselves that have no meaning (a bit like http://monolith.sourceforge.net/). These blocks were then transferred around the network, and people could claim "I'm not transferring files, I'm transferring meaningless blocks of data".
Sure, but (and I'm not a lawyer), intent of law is just as important as the literal meaning of law. Cases play out this intent and add to the corpus of knowledge as case law. So if you did write a program to generate all the data in the world, I'd imagine people would look at your intent, rather than just what you literally did.
You'd run out of disk space before you even generated "Hello, World". Assuming ASCII only, that's 96 bits. In order to save every possible character combination you'd need about 9903 yottabytes [1] of storage.
If you went through and sorted those which were pleasant from those which weren't, you would own all those you sorted. Good luck getting even one picture that looks like anything more than static or a pure color.