July 15, 2024

Curate your own web, search your own web

Instead of using other people’s search engines, searching massive indexes, and always landing on the same websites, I thought I would like to search my own curated list of websites.

In a way, when I am searching for something, I’d like to search my own neighborhood (populated with websites I curated) rather than the whole world through search engines.

I am trying to set up my own search engine and index with a search engine called lieu.


EDIT 23/08/2023: although lieu works as expected, I don’t use it. I see two reasons:

  • I have not figured out how to add websites to the index without rebuilding the index from scratch (this task takes more than a day, as far as I can remember);
  • I have not made lieu the default search engine in my default browser; starting ./lieu and opening a browser pointing to localhost seems to be a barrier. When I am looking for something, I tend to (1) open the default browser, and then (2) run my query in the address bar.

This is what I’ve done so far (this is WIP), and how you can try to reproduce it:

Download and extract lieu:

$ wget https://github.com/cblgh/lieu/releases/download/2022-03-07/lieu-linux.tar.gz # update with the latest release
$ tar -xvzf lieu-linux.tar.gz 

In the configuration file lieu.toml you can update name; that is just the name of the lieu instance you want to run. I have not yet updated the rest as the README suggests. You can also update the following values to customize lieu’s appearance: tagline and placeholder.

Next, populate the file data/webring.txt with the list of websites you want to feed the database of your search engine, like this for example:

https://yctct.com
https://technofeudalism.fyi
https://agency.yctct.com
https://plaintext.website

Or copy a list of websites you have already curated to the file data/webring.txt:

$ cp path/to/your/list.txt data/webring.txt
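
A list kept by hand tends to pick up blank lines, stray whitespace, and duplicates. Here is a minimal sketch of a clean-up step to run instead of a plain copy, assuming one URL per line. clean_list is a hypothetical helper name, not part of lieu, and its output is sorted, so the original order of the list is not kept:

```shell
# clean_list: read a list of URLs on stdin (one per line) and print it
# without carriage returns, surrounding whitespace, blank lines, or duplicates.
# Hypothetical helper, not part of lieu. The output is sorted.
clean_list() {
    tr -d '\r' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        grep -v '^$' |
        sort -u
}
```

It could be used like this:

$ clean_list < path/to/your/list.txt > data/webring.txt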

Then follow the instructions in the README file and run:

$ ./lieu crawl > data/crawled.txt
$ ./lieu ingest
$ ./lieu host
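
Since a crawl can take a long time, it may be worth wrapping the first two commands so that a failed crawl does not overwrite the previous data/crawled.txt. This is only a sketch: rebuild_index and the LIEU variable are hypothetical names, not part of lieu, and the function runs the same crawl and ingest commands as above.

```shell
# rebuild_index: run the crawl and ingest steps, stopping at the first failure.
# LIEU is a hypothetical variable naming the lieu binary; it defaults to ./lieu.
rebuild_index() {
    lieu_bin="${LIEU:-./lieu}"
    mkdir -p data || return 1
    # Crawl into a temporary file so a failed crawl keeps the old output intact.
    "$lieu_bin" crawl > data/crawled.txt.tmp || return 1
    mv data/crawled.txt.tmp data/crawled.txt || return 1
    "$lieu_bin" ingest || return 1
}
```

After rebuild_index returns successfully, ./lieu host is started as before.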

Now, open your web browser and type the address:

localhost:10004

Next

The next step for me is maybe to add a cronjob that starts lieu when I start my system, and to set lieu as the homepage of my web browser, or maybe to figure out how to run a search query from the command line and get the results in the browser. Get in touch if you are trying to set up your own instance. I would like to know if you have done it differently.
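
For the command-line idea, here is a minimal sketch that builds a search URL for the local instance. It rests on assumptions: that the instance listens on localhost:10004 as above, and that it reads the query from a q parameter on the root path, which should be checked against your own instance. urlencode, lieu_search_url, and LIEU_PORT are hypothetical names, not part of lieu, and the encoding assumes ASCII input.

```shell
# urlencode: percent-encode a string for use in a URL query (ASCII input assumed).
# The body runs in a subshell so its variables do not leak into the caller.
urlencode() (
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character
        case $c in
            [a-zA-Z0-9.~_-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

# lieu_search_url: print the search URL for the words given as arguments.
# Assumption: the query goes in a "q" parameter on the root path.
lieu_search_url() {
    printf 'http://localhost:%s/?q=%s\n' "${LIEU_PORT:-10004}" "$(urlencode "$*")"
}
```

On a desktop that provides xdg-open, the results could then be opened in the default browser:

$ xdg-open "$(lieu_search_url plain text websites)"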

I also need to figure out how to add only incremental additions to the database. I tried running $ ./lieu crawl > data/crawled.txt after I had added some new websites to data/webring.txt, but this command started the whole process from scratch.
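
A possible first step is to isolate the newly added websites. The sketch below does only that and does not by itself make lieu crawl incrementally. It assumes a copy of the previous list was saved before editing. new_sites and the file data/webring.prev.txt are hypothetical names, not part of lieu.

```shell
# new_sites OLD NEW: print the lines of NEW that do not appear in OLD.
# Hypothetical helper, not part of lieu.
new_sites() {
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
    grep -Fxv -f "$1" "$2"
}
```

For example:

$ new_sites data/webring.prev.txt data/webring.txt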


personal computing command-line interface (cli) gnu linux trisquel shell literacy office applications wiki digital literacy lieu

No affiliate links, no analytics, no tracking, no cookies. This work © 2016-2024 by yctct is licensed under CC BY-ND 4.0.

GPG fingerprint: 2E0F FB60 7FEF 11D0 FB45 4DDC E979 E52A 7036 7A88