Curate your own web, search your own web with lieu
Instead of using other people’s search engines to search massive indexes, and always landing on the same websites, I thought I would like to search my own curated list of websites.
In a way, when I am searching for something, I’d like to search my own neighborhood (populated with websites I curated) rather than the whole world using “search engines”.
I am trying to set up my own index with a search engine called ‘lieu’.
EDIT 23/08/2023: although lieu works as expected, I don’t use it. I see two reasons:
- I have not figured out how to add websites to the index without rebuilding the index from scratch (this task takes more than a day, as far as I can remember);
- I have not made lieu the default search engine in my default browser; starting ./lieu host and opening a browser pointing to localhost seems to be a barrier. When I am looking for something, I tend to (1) open the default browser, and then (2) run my query in the address bar.
This is what I’ve done so far (this is WIP), and how you can try to reproduce it:
Download and extract ‘lieu’:
$ wget https://github.com/cblgh/lieu/releases/download/2022-03-07/lieu-linux.tar.gz # update with the latest release
$ tar -xvzf lieu-linux.tar.gz
In the configuration file lieu.toml you can update name; that is just the name of the lieu instance you want to run. I have not yet updated the rest as the README suggests. You can also update the following values to customize lieu’s appearance: tagline and placeholder.
Next, populate the file data/webring.txt with the list of websites you want to feed into your search engine’s database, for example:
https://yctct.com
https://technofeudalism.fyi
https://agency.yctct.com
https://plaintext.website
Or copy the list of websites you had already curated to the file data/webring.txt:
$ cp path/to/your/list.txt data/webring.txt
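If your curated list has grown by hand over time, it may contain blank lines, stray spaces, or duplicates. Here is a minimal sketch of how the list could be tidied before it becomes data/webring.txt. The function name clean_list is my own and is not part of lieu; I am also assuming the list holds one URL per line.

```shell
# clean_list FILE: print the URLs found in FILE, one per line, with
# surrounding whitespace removed, blank lines dropped, and duplicates
# collapsed (the output is sorted).
# Hypothetical helper, not part of lieu.
clean_list() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$1" \
        | grep -v '^$' \
        | sort -u
}

# Usage, instead of the plain cp:
#   clean_list path/to/your/list.txt > data/webring.txt
```

The output is sorted, so the original order of the list is lost; I am assuming the order of the entries does not matter to the crawl.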
Then follow the instructions in the README file and run:
$ ./lieu crawl > data/crawled.txt
$ ./lieu ingest
$ ./lieu host
Now, open your web browser and type the address:
localhost:10004
Next
The next step for me is maybe to add a cron job to start lieu when I start my system, and to set lieu as the homepage of my web browser, or maybe to figure out how to run a search query from the command line and get the results in the browser. Get in touch if you are trying to set up your own instance. I would like to know if you have done it differently.
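For the command-line idea, here is a rough sketch of what a helper could look like. It rests on assumptions I have not verified against lieu’s source: that lieu is already running on localhost:10004, and that it reads the query from a q parameter in the URL. The names LIEU_URL, urlencode, lieu_query_url and lieu_search are mine, not lieu’s, and the encoding only handles plain ASCII queries.

```shell
# Hypothetical helpers, not part of lieu.
# ASSUMPTION: lieu is already running (./lieu host) at this address.
LIEU_URL="${LIEU_URL:-http://localhost:10004}"

# urlencode STRING: percent-encode every character except letters,
# digits and . ~ _ - (ASCII input only).
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [a-zA-Z0-9.~_-]) out="$out$c" ;;
            *) out="$out$(printf '%%%02X' "'$c")" ;;
        esac
        s=$rest
    done
    printf '%s' "$out"
}

# lieu_query_url WORDS...: print the search URL for the given words.
# ASSUMPTION: the query is passed in a "q" URL parameter.
lieu_query_url() {
    printf '%s/?q=%s\n' "$LIEU_URL" "$(urlencode "$*")"
}

# lieu_search WORDS...: open the search results in the default browser.
lieu_search() {
    xdg-open "$(lieu_query_url "$@")"
}

# Usage:
#   lieu_search plain text websites
```

Put in a shell startup file, this would turn a query into one command, as long as the lieu server is already running in the background.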
I also need to figure out how to add only incremental additions to the database. I tried running $ ./lieu crawl > data/crawled.txt after I had added some new websites to data/webring.txt, but this command started the whole process from scratch.
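One small piece of that puzzle can be sketched already: working out which websites are new since the last crawl. This does not make lieu incremental, and I do not know whether lieu can ingest a partial crawl without a rebuild. It only assumes that a copy of the previous list was kept somewhere; the file name data/webring.previous.txt and the function new_sites are hypothetical.

```shell
# new_sites OLD NEW: print the lines that are in file NEW but not in
# file OLD. Hypothetical helper, not part of lieu.
new_sites() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || return 1
    # comm needs both inputs sorted the same way
    sort -u "$1" > "$old_sorted"
    sort -u "$2" > "$new_sorted"
    # -1 hides lines only in OLD, -3 hides lines in both
    comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Usage, assuming a copy of the list was saved after the last crawl:
#   new_sites data/webring.previous.txt data/webring.txt
```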