Stretching some tech muscles.
Idle time in which I try to do something tech related is growing scarcer, and I used to regret this, thinking I was stagnating in terms of skills. It took me quite some time, and exchanges with interesting people, to realize that what had really happened is that I moved in a professional direction that now allows me to do the things I like at work, leaving some space for personal life, something that is very important and increasingly ignored among tech people. Nevertheless, every now and then I still indulge in hobby tech fiddling, and this was one of those cases.
Some context
I live in Argentina. We currently have one dominant e-commerce platform that attempts to fill the space both eBay and Amazon occupy elsewhere; it's called Mercado Libre. Among its services it offers free listings for items on sale, which are selectively ranked below other for-sale items that have paid for a more premium listing (this is of course completely fair), but there is one fairly new condition to the service that I don't like: they force you to use their payment processor, called Mercado Pago. I have had rather unpleasant experiences with it (they made me lose a lot of money some years ago) and the reviews of their conflict resolution mediation have not improved over the years. With this problem in mind, I decided that I would rather publish the item on my own blog, which is a static Hugo site. I wanted people to be able to contact me with inquiries and perhaps send me offers, but I definitely did not feel like making them log into Disqus for this, so here is my solution.
Leveraging existing infrastructure.
My rules for this: it should be short (I did this in an afternoon after work). It should require no new services on my server; I did not want to add more work for future me. It should not add new attack vectors to my server; the last thing I need is something else to worry about.
Turns out I already have a line of communication with my site visitors: it's called the access.log (yes, you know where this is going). My current web server (nginx) supports a per-route access log, with custom formatting for each.
Retrieving “User messages”
The first thing I needed was an easy way to go through my logs for a particular path; let's say I chose /getmessages.
Rational formatting for logs
I needed my logs to be structured for this. I don't feel like writing Cthulhu invocations in regexp just to get the message, so I took a sample from this blog and picked a format that would be useful for logstash (what is good for them is good for me).
log_format logstash_json '{ "@timestamp": "$time_iso8601", '
'"@fields": { '
'"remote_addr": "$remote_addr", '
'"remote_user": "$remote_user", '
'"body_bytes_sent": "$body_bytes_sent", '
'"request_time": "$request_time", '
'"status": "$status", '
'"request": "$request", '
'"request_method": "$request_method", '
'"http_referrer": "$http_referer", '
'"http_user_agent": "$http_user_agent" } }';
The output for these looks more or less like:
{
"@timestamp": "2018-02-01T18:23:48+00:00",
"@fields": {
"remote_addr": "xxx.xxx.xxx.xxx",
"remote_user": "-",
"body_bytes_sent": "220",
"request_time": "0.000",
"status": "404",
"request": "GET /getmessages/hello HTTP/2.0",
"request_method": "GET",
"http_referrer": "-",
"http_user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:58.0) Gecko/20100101 Firefox/58.0"
}
}
Seems comfortable enough. I could do without the particularities in the format but, as I said, the less I work on the boring parts of this the better, since it was a quick brainfart.
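To show why this format is comfortable to work with, here is a minimal Go sketch that decodes one such log line into a struct. The names `logEntry` and `parseLine` are hypothetical, made up for this illustration, and the sample line is trimmed to a few fields; it is not the actual parser mentioned later in the post.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logEntry mirrors part of the logstash_json log_format. Every value is a
// string because the format wraps each nginx variable in quotes.
// (Hypothetical type, for illustration only.)
type logEntry struct {
	Timestamp string `json:"@timestamp"`
	Fields    struct {
		RemoteAddr string `json:"remote_addr"`
		Status     string `json:"status"`
		Request    string `json:"request"`
		Method     string `json:"request_method"`
		UserAgent  string `json:"http_user_agent"`
	} `json:"@fields"`
}

// parseLine decodes a single access log line.
func parseLine(line string) (logEntry, error) {
	var e logEntry
	err := json.Unmarshal([]byte(line), &e)
	return e, err
}

func main() {
	line := `{ "@timestamp": "2018-02-01T18:23:48+00:00", "@fields": { "remote_addr": "xxx.xxx.xxx.xxx", "status": "404", "request": "GET /getmessages/hello HTTP/2.0", "request_method": "GET" } }`
	e, err := parseLine(line)
	if err != nil {
		fmt.Println("skipping malformed line:", err)
		return
	}
	fmt.Println(e.Timestamp, e.Fields.Request)
}
```

Fields that are missing from a line simply stay empty, so the struct only needs to list what you actually plan to read.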
Now you only need to set up a few custom things in nginx, like the access log for that particular route and a custom 404 that will show a thank you for the message to avoid confusion (you could get more creative with some JavaScript and perhaps redirect rules, I guess, but this was just to play).
DISCLAIMER: the custom 404 page is preventing logs, so I am trying to determine what is going on.
[...]
location /getmessage {
access_log /var/log/nginx/your_access.json logstash_json;
error_page 404 /messages404.html;
}
[...]
Deciding Conventions.
Even if anything passing through there would be either a message or spam, I needed to decide how to know which message was for what.
I settled on /getmessage/<topic_code>?<messagefields>, so the URL would contain a topic code (an arbitrary string, for what it's worth) that tells me which post or subject the message is about, and the message fields would be plain GET-encoded form fields. This gives flexibility in terms of the data sent, and uniformity to be able to filter out topics I don't care about.
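As an illustration of this convention, the following Go sketch splits the logged request string into a topic code and the form fields. The function `parseRequest` and the constant `prefix` are hypothetical names, and the sample request is invented; it assumes the request line has the usual "METHOD URI PROTOCOL" shape nginx writes for `$request`.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// prefix is the message route; the value is an assumption taken from the
// convention described in the text.
const prefix = "/getmessage/"

// parseRequest splits an nginx $request value such as
// "GET /getmessage/topic?name=x HTTP/1.1" into topic code and form fields.
func parseRequest(request string) (topic string, fields url.Values, err error) {
	parts := strings.Fields(request)
	if len(parts) != 3 {
		return "", nil, fmt.Errorf("unexpected request line %q", request)
	}
	u, err := url.ParseRequestURI(parts[1])
	if err != nil {
		return "", nil, err
	}
	if !strings.HasPrefix(u.Path, prefix) {
		return "", nil, fmt.Errorf("not a message path: %q", u.Path)
	}
	topic = strings.TrimPrefix(u.Path, prefix)
	// ParseQuery undoes the GET form encoding ("+" and %XX escapes).
	fields, err = url.ParseQuery(u.RawQuery)
	return topic, fields, err
}

func main() {
	topic, fields, err := parseRequest("GET /getmessage/blogpost-greetings?name=Ana&message=hello+there HTTP/1.1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(topic, fields.Get("name"), fields.Get("message"))
}
```

Because the fields come back as `url.Values`, a form can add or drop inputs without the parser changing, which is the flexibility the convention is after.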
Parsing and visualizing the logs.
For this I created a small Go binary that I run from cron every night; it can be found at the bottom.
Backstage
I have a few extra protections in place to avoid spam and other malicious intent.
Reading.
This is a mailbox, I just use mutt.
Leave a comment :p
<form action="/messaging/blogpost-greetings" method="GET">
<div>
<label for="name">Name:</label>
<input type="text" name="name" id="name">
</div>
<div>
<label for="email">Email:</label>
<input type="email" name="email" id="email">
</div>
<div>
<label for="message">Message:</label>
<textarea name="message" id="message"></textarea>
</div>
<div class="button">
<button type="submit">Talk to the nginx</button>
</div>
</form>