
Top 5 Data Science GitHub Repositories and Reddit Discussions (February 2019)

Source: 分析大师 | 2019-03-04 | Published by: 经管之家

I love GitHub. I am a daily user of the various features the platform offers. That wasn’t always the case, however.

I had vaguely heard about GitHub during my early data science learning days. The people I spoke to, even some of the influencers, espoused the value of GitHub as a code hosting, sharing and showcase platform. And since I was only just starting to learn R, I couldn’t really see the need for such a platform.

How wrong I was! GitHub is a goldmine for data science professionals, regardless of whether you’re established or just starting out. It will be of tremendous help whether you are learning or following NLP, Computer Vision, GANs or any other data science development.

I was truly won over once I realized that all the big data science focused companies (Google, Facebook, Amazon, Uber, etc.) regularly open source their code on the platform.

It is the best way to keep up with the breakneck developments happening in our field. You even get to download the code and replicate it on your own machine! What more could a data scientist ask for?

In this article, we continue our monthly series showcasing the best GitHub repositories and Reddit discussions from the month just gone by. February was a HUGE month in terms of open source data science libraries.

Let’s get cracking!

At first glance, the collage of faces that accompanies this repository looks entirely typical: nothing to see here. What if I told you none of the people in it are real? That’s right, these folks do not exist.

All these faces were produced by an algorithm called StyleGAN. While GANs have been getting steadily better since their invention a few years back, StyleGAN has taken the game up by several notches. The developers have proposed two new, automated methods to quantify interpolation quality and disentanglement, and have also open sourced a massive high-quality dataset of faces.

This repository contains the official TensorFlow implementation of the algorithm.

GPT-2 won the unofficial “most talked about” Natural Language Processing (NLP) library award in February. The way OpenAI went about launching GPT-2 raised quite a few eyebrows. The team claims that the model works so well they cannot fully open source it for fear of malicious use.

You can imagine why that attracted headlines and questions. They have, however, released a smaller version of the model, which is available in the project’s GitHub repository.

GPT-2 is a large language model with 1.5 billion parameters. The model has been trained on a dataset of 8 million web pages. The aim behind the model is to predict the next word, given all the previous words within some text. Is it state-of-the-art? We’ll have to take OpenAI’s word for it (for now).

Another GAN library?! That’s right, GANs are taking the data science world by storm. SC-FEGAN is as cool in terms of style as the StyleGAN algorithm we covered above.

What SC-FEGAN does is easy to describe: you can edit all sorts of facial images using the deep neural network the developers have trained. We can all become artists just sitting in front of our computers!

The repository helpfully includes steps to help you build the SC-FEGAN model on your own machine. Give it a try! And if computational power is a challenge, hop over to Google Colaboratory and use their free GPU offering.

The premise behind LazyNLP is simple: it enables you to crawl, clean up and deduplicate websites to create massive monolingual datasets.

What do I mean by massive? According to the developer, LazyNLP will allow you to create datasets larger than the one used by OpenAI for training the GPT-2 model. The full-scale one. That certainly had my full attention.

This GitHub repository lists the 5 steps you’ll need to follow to create your own custom NLP dataset. If you’re in any way interested in NLP, you should definitely check out this release.
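To make the "clean up and deduplicate" idea concrete, here is a minimal sketch in plain Python. It does not use LazyNLP or reproduce its API or its deduplication method. The function names (`clean`, `dedupe`, `build_corpus`) are hypothetical, the crawling step is left out, and duplicates are detected only by exact match after normalization, which is an assumption made to keep the example short.

```python
import hashlib
import html
import re

TAG_RE = re.compile(r"<[^>]+>")   # crude HTML tag matcher, fine for a sketch
WS_RE = re.compile(r"\s+")        # any run of whitespace


def clean(raw):
    """Strip HTML tags, decode entities, and collapse whitespace."""
    text = TAG_RE.sub(" ", raw)
    text = html.unescape(text)
    return WS_RE.sub(" ", text).strip()


def dedupe(docs):
    """Keep the first occurrence of each document (case-insensitive exact match)."""
    seen = set()
    unique = []
    for doc in docs:
        # Hash the lowercased text so the set stores fixed-size keys.
        key = hashlib.sha256(doc.lower().encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


def build_corpus(pages):
    """Clean every page, drop empty results, then remove duplicates."""
    cleaned = [clean(page) for page in pages]
    return dedupe([text for text in cleaned if text])


if __name__ == "__main__":
    # Hypothetical already-downloaded pages; no network access is needed.
    pages = [
        "<p>Hello   world</p>",
        "hello world",
        "<div>Tom &amp; Jerry</div>",
        "",
    ]
    print(build_corpus(pages))
```

A real web-scale pipeline would also need near-duplicate detection and a memory-efficient way of tracking what has been seen, but the overall clean-then-deduplicate shape stays the same.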
