Automate your Workflow: Removing Tedium in Everyday Work

Description

As the only editor of a highly active channel, I often read high-quality content from a wide array of people. However, it's not all fun and games - I also need to keep track of author activity, do social promotion of posts, monitor payouts, keep an eye on the competition, and much more. These tasks take up a lot of precious time, and can in large part be automated.

In this talk, I'll go into detail on how I removed the tedium from a lot of my work with Diffbot, PHP, and a set of interesting PHP libraries, automating author analytics and backstabbing detection, link generation, and more.

Talk given at Forum 2014, held on 23 and 24 October 2014.

The speaker

Bruno ŠKVORC

Web Developer from Croatia, managing editor for SitePoint's PHP channel and developer advocate for Diffbot. I keep the fat away by work-walking 4000km (http://4kk.me) and I rant on http://bitfalls.com. Active on Twitter (@bitfalls) and Google+ (+BrunoSkvorc).

Comments

I don't understand where the AI in Diffbot is. Chrome devtools + XSLT
Cédric Lécuret, on 23/10/2014
Interesting talk.
Romain Gautier, on 23/10/2014
I'm not sure I understood everything about Diffbot, but I really appreciate the approach of automating repetitive and tedious tasks. And I was glad to see the PHP editor for SitePoint, a good place to find articles on PHP and other web stuff.
André Vignaud, on 23/10/2014
Well, while Diffbot seems a nice piece of software, I didn't see the link with PHP or even development workflow...
rodrigue, on 23/10/2014
I'm not sure I understood everything about Diffbot, but I'll definitely take a look at it for easily generating APIs based on URLs. The only (big) drawback is that it seems to be a commercial product (and not the cheapest one). This is understandable, but is the Forum PHP made for that? I don't really understand where the "machine learning part" is.
Dits kenny, on 24/10/2014
@all: Thank you very much for the feedback, it's very important to me so I can see which areas I need to work on, and which explanations need to be made more approachable. I'll try to address each comment individually now. @Cedric: The magic of Diffbot is that it uses visual extraction of data - it doesn't rely only on HTML markup, but tries to learn and recognize what's important to humans. It gets better over time, so much so that even if the markup changes (essentially breaking the API in competitors), Diffbot will know to adapt to these changes and keep everything working. @Romain: Thanks! @Andre: I would be happy to clarify more. Is there anything in particular that confuses you? @Rodrigue: Not sure what you mean, could you clarify? The link to diffbot is diffbot.com or @diffbot on Twitter, if that's what you mean, and the slides and source code links are coming soon, as soon as I regain proper internet access at home. @Dits: See my comment to Cedric above, that explains a part of the machine learning. Why do you think it's a big drawback that it's commercial? You can have a free god token for two weeks, and you can also apply for a permanently free token if you have an open source or educational project. Get in touch with me if you'd like to know more about that. It's also important to note that Diffbot also has a database, and a mega-crawler called Crawlbot (more on that on another occasion). You can unleash it on an entire domain and it will return the entire harvested content. Not only that, but it will also ignore landing pages and other trivial and unimportant pages, recognizing them from previous experience, making sure you only get back what you really need. What's more, this data is then saved in both JSON and markup format on Diffbot's servers, and you can fast-search it Elasticsearch-style after it's done, without having to re-crawl. You're essentially getting an entire database backend.
This is why it's a commercial product - it needs to cover these costs somehow. @all: I'm sorry if I focused on Diffbot too much in my talk - it was adapted from a workshop format, and the other aspects of my automation (which I didn't have the chance to cover here) use different tools. I picked the first aspect because it was the most diverse (using the widest array of different technologies), but I could have just as easily skipped Diffbot and talked about something else (for example, invoice generation uses Swiftmailer, HTML2PDF, Gearman, Symfony's EventDispatcher, etc). Stay tuned for the code, I'll post the link here, it should be very interesting to everyone who attended the talk. Once again, thanks for the feedback!
Bruno Škvorc, on 24/10/2014
Nice talk, could have been better if not so formal. Good insight into how you deal with your job as an editor for a very good tech site.
Nelson da Costa, on 25/10/2014
Thanks for the presentation. I must admit I came just because of the speaker and the topic, as I'm a regular reader of SitePoint, which has very interesting articles. The topic sounded good too; however, I had not read that it was almost all about Diffbot! I still appreciated the talk, given in perfectly understandable English, because my company has a similar use for a quite different topic (checking product prices), but I must admit that, like @Dits, I think it's a bit expensive; unfortunately, there's no chance we can use it. And regarding what @Rodrigue said, I think he wanted to know why it was presented at a PHP conference. Probably because it's developed in PHP? But I must admit I do not remember seeing any PHP there.
Guillaume Jariot, on 25/10/2014
Thanks for this product presentation.
Kévin Ziemianski, on 25/10/2014
@nelson Thanks! What do you mean by formal exactly? @guillaume Thanks for the feedback! I'm sorry if it felt too Diffbot focused, and the reason it was in the presentation is twofold: 1. I'm genuinely curious about what people can build with it and 2. while it isn't built in PHP, it is perfect for combining with PHP due to the simplicity of libraries like Guzzle. It removed a huge part of my work after I combined it with PHP, and that was the gist of it. I showed you some PHP in the shape of a Guzzle and Laravel implementation, but this wasn't supposed to be an in-depth tutorial. Stay tuned for slides, more code will follow. I'll definitely try and go more in depth with code next time I talk about this though, thanks for letting me know.
Bruno Škvorc, on 25/10/2014
Maybe your talk would have been more appropriate if it had been more about your job as an editor (choice of subjects, things like that) for a very good tech site.
Alexis Janvier, on 26/10/2014
@alexis: hmm, interesting, thanks. Could you go into more specifics on what you'd like to hear about that?
Bruno Škvorc, on 26/10/2014
This talk provided us with an amazing glance at the current state of automation made available to users. I wish there would be even more liberation of the server code though.
Thierry Marianne, on 27/10/2014
Thank you, Thierry! Much appreciated
Bruno Škvorc, on 28/10/2014