This part of the project is easily the most complicated one I did, and it also took me the most time. There is a quote that ‘information wants to be free’ (a little googling tells me it’s been around for a long time, but I read it in a book by Charles Stross), and this project certainly proves it. Trying to keep kids away from the bad parts of the internet is a good example, since you need to do a ton of things to make the filtering work. Here is what I did:
The Goal
The goal is to have a transparent proxy for http and https that keeps my kids away from bad content and redirects them to a friendly error page to tell them that. I wanted to have content screening and filtering, with whitelists to add in what I want them to have access to.
This design is complicated, and it took me a while to get familiar with all the technologies involved. Plus, there are some things that I just could not get to work with the technology I have.
The Drawbacks
HTTPS makes it impossible (or very hard) to inspect a request and redirect it to another site (that’s part of what makes it secure). Therefore, I ended up getting the system to work using Squid as an explicit proxy rather than a transparent one. A transparent proxy intercepts traffic headed for ports 80 and 443 without the client knowing, while an explicit one is configured on the client and usually listens on port 3128. The transparent proxy ended up not working for me because it wouldn’t decode the HTTPS request to find out what is in the query string. So it would catch ‘https://bigfoot.com’, because it can work out which site the connection is going to (I assume from a reverse DNS lookup on the IP address, or from the hostname the client sends in the clear when the TLS connection starts), but not ‘https://google.com/search?bigfoot’. That’s because in transparent mode the encrypted connection runs from the client straight to the site, so the proxy can’t read it. With an explicit proxy, the client asks the proxy to make the connection, and the proxy (using the CA the client trusts) opens its own HTTPS connection to the site, so it can decode the data.
However, this has the downside that any apps that check for specific certificates when talking to their servers won’t work, because the proxy is forging the site’s certificate. This seems to be an issue with Amazon and Apple, at least. For this particular use case that is arguably a plus, since kids need to switch to a direct connection to buy apps, etc. It also intrinsically disables in-app purchases. To get purchases onto an iPad on this network, you’ll need to switch it to another network where it can connect, then switch it back.
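To make the difference concrete, here is a small Python sketch of which part of a URL a proxy can filter on with and without decrypting the traffic. It is an illustration only, not how Squid is implemented; the function name is made up for this post.

```python
from urllib.parse import urlsplit


def visible_to_proxy(url: str, decrypted: bool) -> str:
    """Return the part of a URL that a proxy can actually filter on."""
    parts = urlsplit(url)
    if parts.scheme == "https" and not decrypted:
        # Without decryption, only the hostname and port are visible;
        # the path and query string travel inside the encrypted tunnel.
        port = parts.port or 443
        return f"{parts.hostname}:{port}"
    # Plain HTTP, or HTTPS that the proxy has decrypted: the whole URL.
    return url


print(visible_to_proxy("https://google.com/search?bigfoot", decrypted=False))
print(visible_to_proxy("https://google.com/search?bigfoot", decrypted=True))
```

The first call shows only the host and port, which is why a hostname-based block works but a search-term block does not until the proxy is decrypting the connection.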
Design
So below is what I came up with, based on a lot of research and trial and error. The ability to create VLANs quickly was invaluable, since I was able to build this without disrupting my main networks.
Effectively, all traffic on the VLAN is blocked, with only port 3128 open, so nothing gets through unless a proxy is configured. The proxy server has a dedicated DNS server which handles rewrites so that Google and other search engines are forced to SafeSearch.
Squid then initiates the HTTPS connection, inspecting the data both on the way in and out using SquidGuard for content filtering.
When using HTTPS, you can’t get a nice error page – you get a weird error, but that seems to be impossible to fix, again, because HTTPS is trying to guarantee that the connection wasn’t tampered with.
Finally, each client has to be set up to use a proxy, and also has to have my self-signed CA installed and trusted.
At scale, it would be a bit of a pain to administer, but with only a few clients, it’s not too bad. If I were to go further, I’d set up WPAD (Web Proxy Auto-Discovery Protocol) to configure the proxies automatically, but then I’d need a web server and other stuff I don’t feel like building. Also, iOS saves proxy settings per wireless network, so manual setup works fine for iPads.
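For what it’s worth, the file that WPAD hands out is a proxy auto-config (PAC) file, which is just a small JavaScript function called FindProxyForURL (conventionally served as wpad.dat). The Python sketch below generates a minimal one. The address 192.168.30.1 is a placeholder for the Kids interface, and I have not deployed this myself.

```python
def build_pac(proxy_host: str, proxy_port: int = 3128) -> str:
    """Return the text of a minimal PAC file that sends everything to one proxy."""
    return (
        "function FindProxyForURL(url, host) {\n"
        f'    return "PROXY {proxy_host}:{proxy_port}";\n'
        "}\n"
    )


# 192.168.30.1 is a placeholder, not an address from my network.
print(build_pac("192.168.30.1"))
```

There is no "DIRECT" fallback in the returned string on purpose: on this network a direct connection would be blocked by the firewall anyway.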
Setup
Squid
Setting up Squid is fairly straightforward. I used the following resources, and ultimately landed on a working configuration.
- https://openschoolsolutions.org/pfsense-web-filter-filter-https-squidguard/
- https://forum.netgate.com/topic/100342/guide-to-filtering-web-content-http-and-https-with-pfsense-2-3
- https://turbofuture.com/internet/Intercepting-HTTPS-Traffic-Using-the-Squid-Proxy-in-pfSense
- https://redmine.pfsense.org/issues/6777
- https://forum.netgate.com/topic/54858/pfsense-squid-proxy-error-111-net-err_tunnel_connection_failed-https-only
- https://forum.netgate.com/topic/109699/squid-https-ssl-filtering-2017-is-this-it
The first step is to install Squid and SquidGuard from the package management setup.
Then go to Services -> Squid Proxy Server
On the ‘General’ Page:
- Check ‘Check to enable Squid Proxy’;
- For the interface, choose to listen on your Kids interface;
- In ‘Use Alternate DNS Servers for Proxy Server’ enter the IP of the interface. This will talk to your main DNS server at first (if it’s listening on that interface) but once we’re done it will connect to an unbound DNS Resolver that will handle rewrites.
For the SSL Man-in-the-Middle ‘MITM’ decoding, you’ll need to set up a self-signed CA (guide here), check ‘Enable SSL filtering’, and select your CA in the ‘CA’ drop down. Also set the ‘SSL/MITM mode’ to ‘Splice WhiteList, Bump Otherwise’ (I can’t say I fully understand this part…).
Note that each time you change the Squid setup, you ought to restart squid from the main pfSense page. It otherwise doesn’t seem like it picks up changes (though it could be just timing on my part).
SquidGuard
Setup
SquidGuard is basically a set of rewrite rules that plug into Squid. I used Shalla’s blacklist, which seems pretty good, and then I had to whitelist a bunch of stuff.
Whenever you make a change in SquidGuard, you need to click the ‘apply’ button from the ‘General Options’ for it to take effect. Don’t forget this.
Also, logs are your friend here. The way I figured out my whitelists was to try to navigate to a site, then see what was blocked on the logs, and then whitelist it, then try again.
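That loop of try, check the log, whitelist, and retry goes faster if you tally the blocked hosts first. Here is a rough Python sketch. It assumes you have already pulled the blocked hostnames out of the log (I’m not reproducing the log format here), the sample hostnames are made up, and it naively treats the last two labels of a hostname as the domain, which is wrong for suffixes like co.uk.

```python
from collections import Counter


def tally_blocked_domains(hostnames):
    """Count blocked hostnames per domain, most frequent first."""
    counts = Counter()
    for host in hostnames:
        labels = host.strip().lower().rstrip(".").split(".")
        if len(labels) >= 2:
            # Naive: keep only the last two labels (e.g. i.ytimg.com -> ytimg.com).
            counts[".".join(labels[-2:])] += 1
    return counts.most_common()


# Hypothetical hostnames, as if copied out of the block log by hand.
blocked = ["i.ytimg.com", "s.ytimg.com", "www.youtube.com", "r3.googlevideo.com"]
for domain, hits in tally_blocked_domains(blocked):
    print(domain, hits)
```

The domains at the top of the tally are the first candidates for a whitelist entry.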
First, you check ‘Check this option to enable SquidGuard’.
Then you check ‘Check this option to enable Blacklist’ and set the ‘Blacklist URL’ to http://www.shallalist.de/Downloads/shallalist.tar.gz. I’m not sure what, exactly, ‘Clean Advertising’ does.
Once that is done, go to the ‘BlackList’ and click ‘download’. It will load and refresh the blacklist.
Then go to the ‘Common ACL’ tab and set the bottom-most item, ‘Default access [all]’, to ‘Deny’. Then I set every category I wanted to deny explicitly (I don’t know if that is necessary or if it’s redundant). Set things you want to allow to ‘whitelist’. Whitelisted sites are passed through without being inspected, as opposed to ‘allow’, where they still are (I think…).
I also set ‘Do not allow IP-Address in URL’ and ‘Use SafeSearch engine’, though the latter doesn’t work in HTTPS, it seems.
WhiteLists
Finally, I added several whitelists in the ‘Target categories’ section (each of which I set to ‘whitelist’ in the Common ACL):
Each whitelist has the DNS domains that are needed based on what I found in the logs:
To get my iPads generally functioning, here are the whitelists I created:
Amazon
media-amazon.com amazon.com amazonvideo.com ssl-images-amazon.com akamaihd.net aiv-delivery.net mads.amazon-adsystem.com amazonaws.com cloudfront.net
Google/YouTube
youtube.com googlevideo.com ytimg.com
Netflix
netflix.com nflxso.net nflxvideo.net
Apple
apple.com mzstatic.com icloud.com
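As I understand it, a domain entry in a list covers the domain itself and anything underneath it, which is why amazon.com is enough for www.amazon.com. The Python sketch below mimics that matching using the lists above. It is my mental model of the behaviour, not SquidGuard’s actual code, and the function name is made up.

```python
WHITELISTS = {
    "Amazon": [
        "media-amazon.com", "amazon.com", "amazonvideo.com",
        "ssl-images-amazon.com", "akamaihd.net", "aiv-delivery.net",
        "mads.amazon-adsystem.com", "amazonaws.com", "cloudfront.net",
    ],
    "Google/YouTube": ["youtube.com", "googlevideo.com", "ytimg.com"],
    "Netflix": ["netflix.com", "nflxso.net", "nflxvideo.net"],
    "Apple": ["apple.com", "mzstatic.com", "icloud.com"],
}


def matching_whitelist(host, whitelists=WHITELISTS):
    """Return the name of the whitelist covering this host, or None."""
    host = host.lower().rstrip(".")
    for name, domains in whitelists.items():
        for domain in domains:
            # Match the domain itself or any subdomain of it.
            if host == domain or host.endswith("." + domain):
                return name
    # Not on any whitelist, so the 'Default access [all]' rule (deny) applies.
    return None


print(matching_whitelist("www.amazon.com"))
print(matching_whitelist("bigfoot.com"))
```

Note that the match is on whole labels, so a lookalike such as notamazon.com is not covered by the amazon.com entry.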
DNS
For DNS setup, I mostly followed the guide at: https://openschoolsolutions.org/pfsense-web-filter-filter-https-squidguard/, which was very comprehensive.
Go to Services -> DNS Resolver, and set it up as below (using the guide to fill in any details). The only major issue is to ensure that BIND and unbound (aka the DNS Resolver) are not both trying to listen on the same port on the same interface (you may need to explicitly change the BIND configuration if you had told it to listen on the KIDS VLAN).
You will need 2 host overrides to force Bing and YouTube to use SafeSearch, and then you’ll set up a file with all the Google IP addresses, per the guide. This will take any and all searches (except Yahoo) and force them to their safe versions. You can also use regex blacklists (available by googling) to further harden this.
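The idea behind the overrides is that each search engine publishes a special hostname that always serves the safe version, and the resolver answers with that host’s address instead of the normal one. A Python sketch of the mapping follows. The safe hostnames are the ones I know the providers document; the lookup function is only an illustration of the rewrite, not unbound configuration, and it covers just one Google domain where the real file needs every country domain.

```python
# Normal hostname -> the provider's "always safe" hostname.
SAFE_HOSTS = {
    "www.google.com": "forcesafesearch.google.com",
    "www.bing.com": "strict.bing.com",
    "www.youtube.com": "restrict.youtube.com",
}


def safe_target(hostname: str) -> str:
    """Return the hostname whose address the resolver should hand back."""
    key = hostname.lower().rstrip(".")
    # Hosts without a safe variant are left alone.
    return SAFE_HOSTS.get(key, key)


print(safe_target("www.bing.com"))
print(safe_target("www.netflix.com"))
```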
Firewall Setup
Finally, you want to block all traffic on that interface except for DNS and Proxy. That will prevent clients from working around the restrictions, and also help ensure that clients are configured correctly.
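In other words, the rule set is a short allow list followed by a block-everything rule. Modelled in Python just to show the logic (the real rules live in the pfSense GUI, and 192.168.30.1 is a placeholder for the firewall’s address on the Kids interface):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    dst_ip: str
    dst_port: int
    proto: str  # "tcp" or "udp"


FIREWALL_IP = "192.168.30.1"  # placeholder address, not from my network

# Only DNS and the proxy port on the firewall itself are reachable.
ALLOWED = {
    (FIREWALL_IP, 53, "udp"),
    (FIREWALL_IP, 53, "tcp"),
    (FIREWALL_IP, 3128, "tcp"),
}


def is_allowed(packet: Packet) -> bool:
    """Pass a packet only if it matches the allow list; block everything else."""
    return (packet.dst_ip, packet.dst_port, packet.proto) in ALLOWED


print(is_allowed(Packet(FIREWALL_IP, 3128, "tcp")))    # proxy traffic
print(is_allowed(Packet("203.0.113.10", 443, "tcp")))  # direct HTTPS to the internet
```

A client with no proxy configured falls into the second case, which is how a misconfigured device shows up immediately: nothing loads.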
Client Setup
For clients to work correctly, you’ll need to set them to use the proxy server (enter your IP address and port 3128), and then also you’ll need to trust the self-signed CA that you created.
If you google ‘trust self-signed CA’ you’ll see how. Here is the iOS process: https://www.thesslstore.com/blog/trust-manually-installed-root-certificates-in-ios/.
Once that is done, your clients on that network will be restricted to the proxy, their traffic will be filtered, and they won’t be able to navigate to sites you don’t want them visiting.
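A quick way to check the setup from a laptop on the Kids network is to make a request through the proxy with the CA trusted. Here is a sketch using Python’s standard library. The proxy address and the CA file path are placeholders, and the fetch itself is left commented out since it only works on that network.

```python
import ssl
import urllib.request


def make_proxy_opener(proxy_host, proxy_port=3128, ca_file=None):
    """Build a URL opener that sends HTTP and HTTPS through the explicit proxy."""
    proxy = f"http://{proxy_host}:{proxy_port}"
    handlers = [urllib.request.ProxyHandler({"http": proxy, "https": proxy})]
    if ca_file:
        # Trust the self-signed CA so the proxy's forged certificates validate.
        context = ssl.create_default_context(cafile=ca_file)
        handlers.append(urllib.request.HTTPSHandler(context=context))
    return urllib.request.build_opener(*handlers)


# Placeholder values; substitute your own proxy address and exported CA file.
opener = make_proxy_opener("192.168.30.1")
# opener = make_proxy_opener("192.168.30.1", ca_file="/path/to/my-ca.crt")
# print(opener.open("https://www.apple.com/", timeout=10).status)
```

If the CA is not trusted, the HTTPS fetch should fail with a certificate verification error, which is the same thing a browser complains about before you install the CA.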
Summary
This is a lot of work to go through to keep a kid from searching for bigfoot (and other moral threats), but it does work, and it’s free. However, it’s incredibly obtuse and maintaining it will not be something that the average user can do.
Read Part 2 of this where I refine it
What I’m listening to as I do this:
Led Zeppelin. I never really got into Led Zeppelin, not even in boarding school, where it was de rigueur. I was reading a Wikipedia article on drummers and it seems that John Bonham was an inspiration to everyone, so I started to listen to all their work. Needless to say, it’s good stuff. Like Queen, I find it to be so varied that I can’t say I like all of it, since there is a huge mix of styles and influences, but there is some great stuff in there.