tensorflow-cuda 1.2.1-2 x86_64   
Library for computation using data flow graphs for scalable machine learning
          python-tensorflow 1.2.1-2 x86_64   
Library for computation using data flow graphs for scalable machine learning
          tensorflow 1.2.1-2 x86_64   
Library for computation using data flow graphs for scalable machine learning
          python-tensorflow-cuda 1.2.1-2 x86_64   
Library for computation using data flow graphs for scalable machine learning
          Autoencoders in Keras, Part 5: GANs (Generative Adversarial Networks) and tensorflow   

Contents



(Because of yesterday's bug with re-uploaded images on Habrastorage, which happened through no fault of mine, I had to take this article down right after publication yesterday. I am posting it again.)

For all the advantages of the variational autoencoders (VAE) we worked with in the previous posts, they have one significant drawback: because of the poor way original and reconstructed objects are compared, the objects they generate, while similar to the objects in the training set, are easy to tell apart from them (for example, they are blurry).

This drawback shows up to a much smaller degree in another approach, namely generative adversarial networks (GANs).

Formally, GANs are of course not autoencoders, but there are similarities between them and variational autoencoders, and they will also be useful for the next part. So it won't hurt to get acquainted with them as well.

GANs in brief


GANs were first proposed in the paper [1, Generative Adversarial Nets, Goodfellow et al, 2014] and are now a very active research area. Most state-of-the-art generative models use an adversarial component in one way or another.

GAN scheme:
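The article's GAN diagram is not reproduced here. As a rough illustration of the generator/discriminator pairing described above, here is a minimal Keras sketch (my own, not the article's code):

```python
# Minimal GAN wiring in Keras (illustrative sketch): the generator maps noise
# to flattened images, the discriminator scores images as real or fake.
from keras.models import Sequential, Model
from keras.layers import Dense, Input
from keras.optimizers import Adam

latent_dim, img_dim = 64, 784

generator = Sequential([
    Dense(128, activation='relu', input_dim=latent_dim),
    Dense(img_dim, activation='sigmoid'),
])

discriminator = Sequential([
    Dense(128, activation='relu', input_dim=img_dim),
    Dense(1, activation='sigmoid'),
])
discriminator.compile(optimizer=Adam(1e-4), loss='binary_crossentropy')

# Combined model used to train the generator: the discriminator is frozen here,
# so only the generator's weights are updated when the stacked model is trained.
discriminator.trainable = False
z = Input(shape=(latent_dim,))
gan = Model(z, discriminator(generator(z)))
gan.compile(optimizer=Adam(1e-4), loss='binary_crossentropy')
```

Training then alternates between a discriminator step on real and generated batches and a generator step through the stacked model.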



Read more →
          Machine Learning Will Be A Vehicle For Many Heists In The Future   

I am spending some cycles on my algorithmic rotoscope work, which is basically a stationary exercise bicycle for learning what is, and what is not, machine learning. I am using it to help me understand and tell stories about machine learning by creating images with machine learning that I can then use in my machine learning storytelling. Picture a bunch of machine learning gears all working together to help make sense of what I'm doing, and WTF I am talking about.

As I write a story on how image style transfer machine learning could be put to use by libraries, museums, and collection curators, I'm reminded of what a con machine learning will be in the future, and how it will be a vehicle for the extraction of value and outright theft. My image style transfer work is just one tiny slice of this pie. I browse the art collections of museums, find images that have meaning and value, then fire up an AWS instance that costs me $1 per hour to run, point it at an image, and extract its style, texture, color, and other characteristics. I take what I extracted from a machine learning training session and package it up into a machine learning model that I can use for a variety of algorithmic objectives.

I didn't learn anything about the work of art. I basically took a copy of its likeness and features--kind of like what the old Indian chief would say to the photographer in the 19th century when they'd take his photo. I'm not just taking a digital copy of this image; I'm taking a digital copy of the essence of this image. Now I can take this essence and apply it in an Instagram-like application, transferring the essence of the image to any new photo the end-user desires. Is this theft? Do I owe the owner of the image anything? I'm guessing it depends on the licensing of the image I used in the image style transfer model--which is why I tend to use openly licensed photos. I'll have to learn more about copyright and see if there are any algorithmic machine learning precedents to be had.

My theft example in this story is just low-level algorithmic art essence theft. However, this same approach will play out across all sectors. A company will approach another company telling them they have this amazing machine learning voodoo: run it against your data, content, and media, and it will tell you exactly what you need to know and give you the awareness of a deity. Oh, and thank you for giving me access to all your data, content, and media; it has significantly increased the value of my machine learning models--something that might not be expressed in our business agreement. This type of business model is above your pay grade, and operating on a different plane of existence.

Machine learning has a number of valuable uses, and some very interesting advancements have been made in recent years, notably around TensorFlow. Machine learning itself doesn't have me concerned. What concerns me is the investment behind machine learning, the less-than-ethical approaches of some machine learning companies I am watching, and their tendency to make wild claims about what machine learning can do. Machine learning will be the trojan horse for this latest wave of artificial intelligence snake oil salesmen. All I am saying is that you should be thoughtful about which machine learning solutions you connect to your backend, and, when possible, make sure you are only connecting them to a sandboxed, special version of your world that won't actually do any damage when things go south.


          FS#54652: [python-tensorflow-cuda] ModuleNotFoundError upon importing tensorflow   
Package version: python-tensorflow-cuda 1.2.1-1


Steps to reproduce:
```
[omtcyfz@omtcyfz-arch ~]$ python
Python 3.6.1 (default, Mar 27 2017, 00:27:06)
[GCC 6.3.1 20170306] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
Traceback (most recent call last):
File "", line 1, in
File "/usr/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in
from tensorflow.python import *
File "/usr/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 63, in
from tensorflow.python.framework.framework_lib import *
File "/usr/lib/python3.6/site-packages/tensorflow/python/framework/framework_lib.py", line 100, in
from tensorflow.python.framework.subscribe import subscribe
File "/usr/lib/python3.6/site-packages/tensorflow/python/framework/subscribe.py", line 26, in
from tensorflow.python.ops import variables
File "/usr/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 26, in
from tensorflow.python.ops import control_flow_ops
File "/usr/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 70, in
from tensorflow.python.ops import tensor_array_ops
File "/usr/lib/python3.6/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 33, in
from tensorflow.python.util import tf_should_use
File "/usr/lib/python3.6/site-packages/tensorflow/python/util/tf_should_use.py", line 28, in
from backports import weakref # pylint: disable=g-bad-import-order
ModuleNotFoundError: No module named 'backports'
>>> quit()
```

Since 1.2.0, importing tensorflow produces the error shown above, which indicates that the `backports` Python package is missing. I didn't find this package in the Arch package index, and it's certainly not in the [python-tensorflow-cuda] package requirements, but it seems like the solution would simply be to install that package along with python-tensorflow-cuda.
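A quick way to confirm the diagnosis before installing anything (an illustrative check of mine, not part of the bug report):

```python
# TensorFlow 1.2's tf_should_use.py does `from backports import weakref`,
# so that module must be importable for `import tensorflow` to succeed.
try:
    from backports import weakref  # noqa: F401
    print("backports.weakref found - the import error should not occur")
except ImportError:
    print("backports.weakref is missing - install it before importing tensorflow")
```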
          FS#54646: [python-tensorflow-cuda] Error when importing tensorflow   
Description:
When trying to import tensorflow this happens:

```
>>> import tensorflow
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/usr/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 52, in <module>
    from tensorflow.core.framework.graph_pb2 import *
  File "/usr/lib/python3.6/site-packages/tensorflow/core/framework/graph_pb2.py", line 10, in <module>
    from google.protobuf import descriptor_pb2
  File "/usr/lib/python3.6/site-packages/google/protobuf/descriptor_pb2.py", line 238, in <module>
    serialized_end=4762,
  File "/usr/lib/python3.6/site-packages/google/protobuf/descriptor.py", line 599, in __new__
    return _message.default_pool.FindEnumTypeByName(full_name)
KeyError: "Couldn't find enum google.protobuf.MethodOptions.IdempotencyLevel"
```


Additional info:
python-tensorflow-cuda 1.2.1-1
python-protobuf 3.3.1-1

Steps to reproduce:
Open python3 and type "import tensorflow"
          Bonsai Expands TensorFlow Support with Gears, Extending Functionality of AI Platform for Enterprises Building Industrial Applications   
Bonsai, provider of an AI platform that empowers enterprises to build and deploy intelligent systems, released Gears, a top feature requested by customers in the Bonsai Early Access Program. Gears further extends the value of Bonsai to data scientists, providing them with a tool to manage, deploy and scale previously developed machine learning models, including those built with TensorFlow, within the Bonsai Platform.
          Podcast FS Hebdo - November 9 to 15   
From Google open-sourcing TensorFlow, to a simple eye drop against cataracts, Jupiter possibly having ejected a giant planet, and greenhouse gases once again breaking their records, discover the science news for the week of November 9 to 15.
          We are looking for a colleague for a data scientist position. | Tasks: Interact with customers to underst...   
We are looking for a colleague for a data scientist position. | Tasks: Interact with customers to understand their requirements and identify emerging opportunities. • Take part in high- and detailed-level solution design to propose solutions and translate them into functional and technical specifications. • Convert large volumes of structured and unstructured data using advanced analytical solutions into actionable insights and business value. • Work independently and provide guidance to less experienced colleagues/employees. • Participate in projects, work closely and collaborate effectively with onsite and offsite teams at different worldwide locations in Hungary/China/US while delivering and implementing solutions. • Continuously follow data science trends and related technology evolutions in order to develop the knowledge base within the team. | What we offer: Being a member of a dynamically growing site and an enthusiastic team. • Professional challenges and opportunities to work with prestigious multinational companies. • Competitive salary and further career opportunities. | Requirements: Bachelor's/Master's Degree in Computer Science, Math, Applied Statistics or a related field. • At least 3 years of experience in modeling, segmentation, statistical analysis. • Demonstrated experience in Data Mining and Machine Learning; additionally, Deep Learning (TensorFlow) or Natural Language Processing is an advantage. • Strong programming skills using Python, R, SQL and experience with algorithms. • Experience working on big data and related tools (Hadoop, Spark). • Openness to improving his/her skills and competencies and to learning new techniques and methodologies. • Strong analytical and problem-solving skills to identify and resolve issues proactively. • Ability to work and cooperate with onsite and offsite teams located in different countries (Hungary, China, US) and time zones. • Strong verbal and written English communication skills. • Ability to handle strict deadlines and multiple tasks. | More info and application here: www.profession.hu/allas/1033284
          What should I do when installing tensorflow with pip fails?   

          How do I install the tensorflow_gpu-1.2.1-cp35-cp35m-win_amd64.whl file?   

          How do I install tensorflow with Python 3.6?   

          Behind its market-leading position, ofo is also upgrading and transforming   

As the earliest dockless shared bikes offered to the general public, ofo is characterized by low per-bike cost and simple, convenient use. ofo's ability to deploy bikes at scale leaves a strong impression: its little yellow bikes can be seen wherever you go in a city, which is why ofo has held the number one market share throughout its rapid expansion.

According to the latest iResearch data, ofo had 62.72 million monthly active users in May this year, up 53.3% month over month. Those active users used ofo services as many as 1.347 billion times in May, surpassing Didi Chuxing's 1.17 billion rides; for the first time, bike sharing overtook ride hailing in usage frequency, and this breakthrough was achieved by ofo.

If one firm's report is not proof enough, the conclusions from the entire third-party data industry have so far been strikingly consistent: iResearch, Analysys, Trustdata, Cheetah and others all recognize ofo's leading position.

Today the US data company 7 park data released a report on China's bike-sharing industry, stating that ofo leads the development of China's bike-sharing market with a 65% market share. This authoritative figure further confirms the conclusions of the domestic data firms.

Outside observers have long paid attention to ofo's damage rate, and ofo has kept upgrading its product, using technology to bring the bike damage rate down. Since it began offering the service, ofo has launched six bike models for different usage scenarios, moving quickly from mostly mechanical locks to today's smart locks; the pace of this upgrade has been steady, orderly and fairly fast. A considerable number of smart locks can now be seen in Beijing, Shanghai and other cities.

The lock is indeed an important factor in the damage rate, and for a company of ofo's scale every lock upgrade can bring the damage rate down substantially. At Davos, Dai Wei said that ofo's repair-request rate has fallen from an initial 20% to 5%, a change ordinary users can feel as well.

Although ofo has not yet fitted smart locks to all of its bikes (it has the capacity to deploy them quickly), it has never let up on smart-lock R&D. In June this year, the NB-IoT "IoT smart lock" jointly developed by ofo, China Telecom and Huawei began to be deployed on its bikes. It is the world's first IoT smart lock and the first commercial use of narrowband IoT technology in a mobility scenario. (It is the world's first because only Huawei can mass-produce the chips; other companies are still at the testing stage.) This kind of smart lock has strong signal, strong connectivity and low power consumption, which suits the bike-sharing scenario very well; ofo develops the smart lock, China Telecom provides the network, and Huawei provides the chips.

This upgrade goes a long way toward lowering the damage rate; if it is rolled out everywhere, damaged yellow bikes may all but disappear from the streets. But ofo's purpose in developing this smart lock is clearly not just to reduce damage: it is aiming at the future IoT industry. The gap between mobile communication and IoT communication is very large: the former merely opens a lock, while the latter can do much more, such as recording users' riding tracks, learning about user demand, and dispatching bike deployment sensibly; more services may open up from there.

Besides developing and deploying the IoT smart lock, ofo has also made great strides in back-end algorithms: convolutional neural networks are ofo's foundational algorithm, and Google's TensorFlow artificial intelligence system has been brought in to improve the efficiency of demand forecasting and enable intelligent dispatching. The accumulation of back-end AI systems and algorithms will in turn power the continuous upgrading of front-end devices, each reinforcing the other. In ofo's view, IoT cannot be achieved in one stroke; it requires many conditions, including the network environment, algorithms, chips and cost, and ofo does not want to attempt a great leap forward here.

According to industry data, China's bike-sharing market will reach 10.28 billion yuan in 2017, a growth rate of 735.8%, with 209 million users. ofo currently leads in city coverage, monthly active users, usage rate, user preference and user stickiness, but facing such a large market and so much room for growth it inevitably harbors bigger ambitions. ofo recently announced it will deploy 20 million bikes before the end of the year, intending to carve out an overwhelming share of the market before the shake-out; how it balances the low entry threshold that favors rapid share gains against the unsightly high damage rate therefore becomes very important, and ofo can be expected to handle it.

After all, having moved from scaling up fast to upgrading on the run, ofo, now entering the stage of refined operations, wants to build a long-term business that can stand the test of time.

A market where a hundred flowers bloom is best: everyone has their own reasoning, their own goals and approach, and no one is wrong. In the near future, bike sharing may well become a source of pride for China.

 
          Voice Kit: a complete kit to build your own voice assistant!   

The MagPi has a knack for surprising us. Last April, the official Raspberry Pi magazine and Google offered a complete kit to build your own voice assistant in the style of Google Home, Amazon Echo or (Microsoft) Invoke. For the price of the magazine it was a great deal, even though the Pi itself was not included. You can't have everything! Here is our hands-on.

As every time The MagPi ships a kit or a free Pi Zero, nothing was left after a few hours! There is talk of selling the Voice Kit on its own, but for the moment no precise information is available. The kit is a voice user interface (VUI) that uses cloud APIs and services.

Kit contents

-        cardboard case to assemble;

-        Voice Hat board;

-        Voice Hat microphone;

-        set of cables;

-        button to activate the assistant;

-        1 set of headers (to solder);

-        speaker.

To get it working you must add a Raspberry Pi, an Internet connection (preferably WiFi), a keyboard and mouse, a screen with an HDMI cable and an SD card for the system. Not forgetting the power supply.

The heart of the kit: the Voice Hat

The central element of the kit is the Voice Hat shield. To use it, you place it on a Pi board. The operation is simple. Don't forget to stabilize the shield with the plastic standoffs supplied as standard.

The Voice Hat is tailored to the Voice Kit, connecting the microphone, the button and the speaker. The board also offers several GPIOs for connecting all kinds of sensors. There are 6 servo connectors able to drive motors and sensors. Each servo (0 to 5) consists of three pins: pin (GPIO), 5V and GND. There are also 3 drivers for connecting various sensors. On top of that, you can add sensors over I2C and SPI.

Mind the voltage and power delivered by these GPIOs. The headers for the GPIOs are not soldered; that is up to you.

In short, there is plenty you can do with this shield. By default you only build a voice assistant, but you can quickly extend the electronics and add many sensors that broaden the use of your Voice Kit.

Added to that is a small board, the Voice Hat Microphone. It has two microphones (left / right) and is connected to the shield over I2C. In theory it is possible to use the microphone board independently of the Voice Hat.

10 minutes to assemble

Assembly itself is very quick. The assembly guide is available online:

https://aiyprojects.withgoogle.com/voice/

The cardboard case gets your IoT device started, but it tends not to hold the boards in place. For our part, we found and 3D-printed a case. Printing takes a long time (about 10 hours), but the result is very nice.

Don't forget to flash your SD card with the system image, which is a variant of Raspbian.

10 minutes to launch the assistant

A voice assistant means a connected device. The Voice Kit is first and foremost a connected object, so an Internet connection is essential. Our DIY kit uses the Google Assistant SDK.

First, we boot the assistant with the flashed SD card. We connect to the Internet over WiFi. We check that audio works with the Check audio file, and do the same for WiFi with Check WiFi. If problems arise, check the assembly and make sure all the connections are good.

The longest part of the installation is the Google Cloud Platform side and the connections to the Assistant services. You need a Google Cloud Platform account and the API enabled (everything is done from the Cloud Platform console). You create a new project and an OAuth 2.0 client, which gives you access to the credentials and the OAuth client ID. These are used to authenticate the service and to access it from your Voice Kit.

A few operations from the shell and a few commands put everything in place. The last step, if all goes well, is to enable (in Activity Controls) the location services, voice activity, and so on.

Note that the online installation guide differs slightly from the one printed in issue 57 of The MagPi.

Once the whole installation is finished, the assistant is ready. You now open the Start dev terminal to launch the assistant. If it does not start automatically (the button should turn green), type src/main.py to launch the assistant manually.

It is possible to start the assistant automatically at boot as a system service (via sudo systemctl enable), which is not done by default.

The assistant can be activated in two ways:

-        with the button (the so-called arcade button);

-        with a simple hand clap (a single clap).

You just configure the voice-recognizer.ini file and change the trigger to gpio / clap. Note that by default the assistant is in English.

Commands

The default commands are limited: hello, what time is it, tell me a joke, volume control. They are above all examples of how to use the Google Assistant service with the Voice Kit. Fortunately, you can extend them with your own commands.

The Voice Kit supports Google Assistant commands as well as simple local commands.

The new actions can be very diverse: from a simple local question-and-answer to driving a sensor via the GPIO.

You then modify the code of the action.py file to add the desired action. Custom commands live in the def_add_commands_just_for_cloud_speech section.

For example:

-        for a simple command, use simple_command('question', 'answer');

-        you can interact with the sensors wired to your code using actor.add_keyword(action, GpioWrite);

-        you can also interact with the system, for instance doing a shutdown or a reboot via a SpeakShellCommandOutput-type command.

After each modification or addition of actions, it is best to reboot the system. Latency is generally good, but it occasionally degrades. The kit's listening quality is not always optimal, especially in a noisy environment. Sometimes the kit also refuses to connect to the Google service or to recognize any command at all. Check the connection and restart; that is usually enough.

Example:

# =========================================

# Makers! Implement your own actions here.

# =========================================

import RPi.GPIO as GPIO

class GpioWrite(object):
    '''Write the given value to the given GPIO.'''

    def __init__(self, gpio, value):
        GPIO.setmode(GPIO.BCM)
        GPIO.setup(gpio, GPIO.OUT)
        self.gpio = gpio
        self.value = value 

    def run(self, command):
        GPIO.output(self.gpio, self.value) 

-> then implement the action and its response:

actor.add_keyword('light on', GpioWrite(4, True))
actor.add_keyword('light off', GpioWrite(4, False)) 

A solid 15/20

Honestly, for the price of the Voice Kit, when you manage to find one at an acceptable price (which is very difficult), this IoT device is really good. The quality of the Voice Hat is surprising. The software side deserves better integration to eliminate a few steps, but remember that this is a maker / DIY approach, not a finished product like Google Home or Amazon Echo. Above all, adding commands is quite quick, even when interfacing with sensors via the GPIOs. It is an excellent way to understand artificial intelligence through a voice assistant.

This kit is more approachable than Seeed Studio's ReSpeaker. Yes, the ReSpeaker is more powerful and offers better audio quality (via its microphone shield), but its price and the lack of integration on the software side, along with documentation that is too thin, ended up winning us over to the Voice Kit.

You can also interface your IoT device with Google's TensorFlow.

Android Things

The Voice Kit is also supported by Android Things, Google's IoT platform. Developer Preview 3.1 includes this support by default. The system only supports the Raspberry Pi 3.

You must use the latest version of Android Things and follow the installation and configuration instructions.

François Tonic


          Deep learning can't be done in R?   







As everyone knows, R is the best language for statistical analysis. But with the help of Keras and TensorFlow, R can now also be used for deep learning.


When choosing a language for machine learning, R versus Python has always been a contentious topic. With the explosive growth of deep learning, more and more people have chosen Python, because it had a large set of deep learning libraries and frameworks and R did not (until now).


But I wanted to enter the deep learning space with R, so I moved from the Python world to the R world to continue my deep learning work. That may have seemed almost impossible, but today it has become possible.


With the launch of Keras for R, the R-versus-Python contest is back at center stage. Python had slowly become the most popular choice for deep learning models. But with the release of the Keras library with an R interface, which can use TensorFlow as its backend (with both CPU and GPU compatibility), R is once again on an even footing with Python in the deep learning arena.


Below we will see how to install Keras with a TensorFlow backend in R, and build our first neural network model on the classic MNIST dataset in RStudio.


Contents:


1. Install Keras with TensorFlow as the backend.


2. The different types of models that can be built in R with Keras.


3. Classify MNIST handwritten digits with an MLP in R.


4. Compare the MNIST results with the equivalent code in Python.


5. Closing notes.


1. Install Keras with TensorFlow as the backend

The steps for installing Keras in RStudio are very simple. Just follow the steps below and you will smoothly create your first neural network model in R.


install.packages("devtools")

devtools::install_github("rstudio/keras")

The steps above install the keras library from its GitHub repository. Now it is time to load keras into R and install TensorFlow.


library(keras)


By default, RStudio loads the CPU version of TensorFlow. Use the following command to download the CPU version of TensorFlow.


install_tensorflow()


To install a version of TensorFlow with GPU support for a single-user/desktop system, use the following command.


install_tensorflow(gpu=TRUE)

For multi-user installations, refer to this installation guide.


Now that we have keras and TensorFlow installed in RStudio, let us start building our first neural network in R to tackle the MNIST dataset.


2. The different types of models that can be built in R with keras


Below is the list of models that can be built in R using Keras.


1. Multilayer perceptrons


2. Convolutional neural networks


3. Recurrent neural networks


4. Skip-gram models


5. Pre-trained models such as VGG16, ResNet, etc.


6. Fine-tuning of pre-trained models.


Let us start by building a very simple MLP model with just one hidden layer to try to classify the handwritten digits.


3. Classifying MNIST handwritten digits with an MLP in R

#loading keras library

library(keras)

#loading the keras inbuilt mnist dataset

data<-dataset_mnist()

#separating train and test file

train_x<-data$train$x

train_y<-data$train$y

test_x<-data$test$x

test_y<-data$test$y

rm(data)

# converting a 2D array into a 1D array for feeding into the MLP and normalising the matrix

train_x <- array(train_x, dim = c(dim(train_x)[1], prod(dim(train_x)[-1]))) / 255

test_x <- array(test_x, dim = c(dim(test_x)[1], prod(dim(test_x)[-1]))) / 255

#converting the target variable to one hot encoded vectors using keras inbuilt function

train_y<-to_categorical(train_y,10)

test_y<-to_categorical(test_y,10)

#defining a keras sequential model

model <- keras_model_sequential()

#defining the model with 1 input layer[784 neurons], 1 hidden layer[784 neurons] with dropout rate 0.4 and 1 output layer[10 neurons]

#i.e number of digits from 0 to 9

model %>%

layer_dense(units = 784, input_shape = 784) %>%

layer_dropout(rate=0.4)%>%

layer_activation(activation = 'relu') %>%

layer_dense(units = 10) %>%

layer_activation(activation = 'softmax')

#compiling the defined model with metric = accuracy and optimiser as adam.

model %>% compile(

loss = 'categorical_crossentropy',

optimizer = 'adam',

metrics = c('accuracy')

)

#fitting the model on the training dataset

model %>% fit(train_x, train_y, epochs = 100, batch_size = 128)

#Evaluating model on the cross validation dataset

loss_and_metrics <- model %>% evaluate(test_x, test_y, batch_size = 128)

The code above gives a training accuracy of 99.14 and a validation accuracy of 96.89. The code ran on an i5 processor with a running time of 13.5 seconds, while on a TITAN X GPU the validation accuracy was 98.44 with an average running time of 2 seconds.


4. MLP with keras - R vs. Python

For the sake of comparison, I also implemented the MNIST problem above in Python. I expected there to be no difference between keras in R and in Python, since keras in R creates a conda instance and runs keras inside it. You can try running the equivalent Python code below.


#importing the required libraries for the MLP model

import keras

from keras.models import Sequential

import numpy as np

#loading the MNIST dataset from keras

from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

#reshaping the x_train, y_train, x_test and y_test to conform to MLP input and output dimensions

x_train=np.reshape(x_train,(x_train.shape[0],-1))/255

x_test=np.reshape(x_test,(x_test.shape[0],-1))/255

import pandas as pd

y_train=pd.get_dummies(y_train)

y_test=pd.get_dummies(y_test)

#performing one-hot encoding on target variables for train and test

y_train=np.array(y_train)

y_test=np.array(y_test)

#defining model with one input layer[784 neurons], 1 hidden layer[784 neurons] with dropout rate 0.4 and 1 output layer [10 #neurons]

model=Sequential()

from keras.layers import Dense

model.add(Dense(784, input_dim=784, activation='relu'))

model.add(keras.layers.core.Dropout(rate=0.4))  # add the dropout layer to the model (it was only instantiated before)

model.add(Dense(10,input_dim=784,activation='softmax'))

# compiling model using adam optimiser and accuracy as metric

model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])

# fitting model and performing validation

model.fit(x_train,y_train,epochs=50,batch_size=128,validation_data=(x_test,y_test))

The model above achieved a validation accuracy of 98.42 on the same GPU. So what we guessed at the outset turned out to be correct.


5. Closing notes

If this was your first deep learning model in R, I hope you enjoyed it. With very simple code, you can classify handwritten digits with about 98% accuracy. That should be enough motivation to get you started with deep learning.


If you already work with the keras deep learning library in Python, you will find the syntax and structure of the keras library in R similar to what you know from Python. In fact, the keras package in R creates a conda environment and installs everything needed to run keras in that environment. What excites me more, though, is now seeing data scientists build real-life deep learning models in R. As they say, the competition should never stop. I would also like to hear your views on this new development; you can share them in the comments below.




          Comment on How to Get Reproducible Results with Keras by Jason Brownlee   
No, I believe LSTM results are reproducible if the seed is tied down. Are you using a TensorFlow backend? Are Keras, TF and your scipy stack up to date?
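For reference, "tying the seed down" with a TensorFlow backend usually looks something like this minimal sketch (exact reproducibility can still depend on the backend version and hardware):

```python
# Fix the random seeds before building/training the Keras model (TF 1.x style).
import random
import numpy as np
import tensorflow as tf

random.seed(1)
np.random.seed(1)
tf.set_random_seed(1)   # graph-level seed for the TensorFlow backend
```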
          Big Data Engineer   
Best Buy Canada Ltd. (Burnaby BC): "RSS feeds and more; automating and QAing the process. You have your ear to ground on emerging Big Data technologies: Google Cloud Platform, TensorFlow, IBM Watson your educational background?...."
          Tensorflow Programs and Tutorials   
This repository contains some toy experiments based on TensorFlow that introduce deep learning concepts used for image recognition and language modeling.
          TensorFlow: Google releases its artificial intelligence software to the Open Source world   

Google releases to the Open Source world its machine learning platform, which is at the heart of the artificial intelligence thanks to which smartphone apps will soon be able to perform functions that were impossible until today.

Author: byoblu
Tags: Google Artificial intelligence Learning Machine Tensorflow Apprendimento automatico Intelligenza artificiale
posted: 10 November 2015


          TensorFlow in the cloud: accelerating with elastic GPU resources   
爱可可-爱生活   web version 2017-06-30 06:15 experience write-up GPU blog 【云端TensorFlo […]
          The rise of artificial intelligence: what does it mean for development?   

Video: Artificial intelligence and the SDGs (International Telecommunication Union)

Along with my colleagues on the ICT sector team of the World Bank, I firmly believe that ICTs can play a critical role in supporting development. But I am also aware that professionals on other sector teams may not necessarily share the same enthusiasm.

Typically, there are two arguments against ICTs for development. First, to properly reap the benefits of ICTs, countries need to be equipped with basic communication and other digital service delivery infrastructure, which remains a challenge for many of our low-income clients. Second, we need to be mindful of the growing divide between digital-ready groups vs. the rest of the population, and how it may exacerbate broader socio-economic inequality.

These concerns certainly apply to artificial intelligence (AI), which has recently re-emerged as an exciting frontier of technological innovation. In a nutshell, artificial intelligence is intelligence exhibited by machines. Unlike the several “AI winters” of the past decades, AI technologies really seem to be taking off this time. This may be promising news, but it challenges us to more clearly validate the vision of ICT for development, while incorporating the potential impact of AI.

It is probably too early to figure out whether AI will be a blessing or a curse for international development… or perhaps this type of binary framing may not be the best approach. Rather than providing a definite answer, I'd like to share some thoughts on what AI means for ICT and development.

AI and the Vision of ICT for Development

Fundamentally, the vision of ICT for development is rooted in the idea that universal access to information is critical to development. That is why ICT projects at development finance institutions share the ultimate goal of driving down the cost of information. However, we have observed several notable features of the present information age: 1) there is a gigantic amount of data to analyze, which is growing at an unprecedented rate and 2) in the highly complex challenges of our world, it is almost impossible to discover structures in raw data that can be described as simple equations, for example when finding cures for cancer or predicting natural disasters.

This calls for a new powerful tool to convert unstructured information into actionable knowledge, which is expected to be greatly aided by artificial intelligence. For instance, machine learning, one of the fastest-evolving subfields in AI research, provides feature predictions with greatly enhanced accuracies at much lower costs. As an example, we can train a machine with a lot of pictures, so that it can later tell which photos have dogs in it or not, without a human’s prior algorithmic input.
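As a loose illustration of that dog-photo example (my own sketch, not from the article; the training data here is just a random placeholder), a tiny Keras classifier could look like this:

```python
# Minimal "does this photo contain a dog?" classifier (illustrative sketch).
# Assumes images are 64x64 RGB arrays with 0/1 labels.
import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(16, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(1, activation='sigmoid'),   # probability that the photo shows a dog
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Placeholder data standing in for a real labeled photo collection.
x_train = np.random.rand(32, 64, 64, 3)
y_train = np.random.randint(0, 2, size=32)
model.fit(x_train, y_train, epochs=1, batch_size=8)
```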

To summarize, AI promises to achieve the vision of ICT for development much more effectively. Then, what are some practical areas of its usage?

AI for development: areas of application

Since AI research is rapidly progressing, it is challenging to get a clear sense of all the different ways AI could be applied to development work in the future; nonetheless, the following are a couple areas where current AI technologies are expected to provide significant added-value.

First, AI allows us to develop innovative new solutions to many complex problems faced by developing countries. As an example, a malaria test traditionally requires a well-trained medical professional who analyzes blood samples under a microscope. In Uganda, an experiment showed that real-time and high-accuracy malaria diagnoses are possible with machines running on low-powered devices such as Android phones.

Secondly, AI could make significant contributions to designing effective development policies by enabling accurate predictions at lower costs. One promising example is the case of the US-based startup called Descartes. The company uses satellite imagery and machine learning to make corn yield forecasts in the US. They use spectral information to measure chlorophyll levels of corn, which is then used to estimate corn production. Their projections have proven to be consistently more accurate than the survey-based estimates used by the US Department of Agriculture. This kind of revolution in prediction has great potential to help developing economies design more effective policies, including for mitigating the impact of natural disasters.
Weekly state and county-level corn prediction by Descartes lab

Looking forward – Toward the democratization of AI?

Many assume that it is too early to talk about AI in the developing world, but the mainstreaming of AI may happen sooner than most people would assume. Years ago, some tech visionaries already envisioned that AI would soon become a commodity like electricity. And this year, Google revealed TensorFlow Lite, the first software of its kind that runs machine learning models on individual smartphones. Further, Google is working on the AutoML project, an initiative to leverage machine learning to automate the process of designing machine learning models themselves.

As always, new technology can be liberating and disruptive, and the outcome will largely depend on our own ability to use it wisely. Despite the uncertainty, AI provides another exciting opportunity for the ICT sector to leverage technological innovation for the benefit of the world’s marginalized populations.
 
          Google's drawing AI learns and evolves: draw a line and it predicts what comes next and fills it in   
"Multi-Prediction": draw only the black part, and the AI imagines the bird and draws the rest. On the 26th (US time), Google published research results for "sketch-rnn", a drawing AI based on a neural network modeled on the neural circuits of the human brain. Google had previously run the "Quick, Draw!" game, in which users were given a text prompt and had 20 seconds to draw an answer for an AI to guess. By training the AI on the millions of drawings users produced, it can now predict the finished picture from a half-drawn sketch and add the missing lines. In the demo currently available, dozens of motifs are provided as models and the AI adds strokes to match: for example, if you select face as the model, draw something like an outline and stop, the AI automatically draws the eyes and nose. [Image: a face as imagined and drawn by the AI] There is also a "Multi-Prediction" demo that shows several candidates: you draw lines on the left and nine candidates appear on the right; for example, if you select bird and draw something like a wing, the right side shows bird shapes that build on your strokes.

Also available are "Interpolation", which draws random blends of two different pictures, and "Variational Auto-Encoder", which imitates a sample picture drawn by the user - demos that give a glimpse of a future in which AI contributes to human drawing. Announcement URL: https://magenta.tensorflow.org/sketch-rnn-demo 2017/06/30




          Google open-sources TensorFlow's training tools   
Tensor2Tensor simplifies the training of deep learning models so that developers can more easily create machine learning workflows.
          O’Reilly AI Conference News Roundup   

O'Reilly AI Conference news roundup: TensorFlow support, deep learning, and AI training data were some notable announcements.

The post O’Reilly AI Conference News Roundup appeared first on RTInsights.


          Hops and Tensorflow enabling rapid machine learning   


Distributed Tensorflow-as-a-Service now available on SICS ICE
Starting this June, we are offering Tensorflow as a managed service on the Hopsworks platform, hosted at the SICS ICE research datacenter facility in Luleå. 

hadoop, hops, data center, SICS ICE, big data, machine learning

RISE SICS
          Cloud Developer with Tensorflow Exp   

          OneNote alternative for Linux (9 posts)   
Win10 has driven me crazy for 2 days; neither tensorflow nor keras works properly, and I can't do multiprocessing with tensorflow on stupid win10.

I'm about to say to hell with atomic physics and install CentOS; my only problem is OneNote. Is there an alternative to it? It's an amazingly effective app: taking screenshots straight off the screen and dropping them in, tagging what you write - functions I use a lot.
          Comment on TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level ML Frameworks – KDD 2017 by sahil9821   
Vladhin he looks like he's scared that if he speaks something incorrectly his mum will spank him
          Comment on TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level ML Frameworks – KDD 2017 by sahil9821   
Kishore Shinobi Python
          Comment on TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level ML Frameworks – KDD 2017 by Vladhin   
This guy looks scared
          (Repost) Installing the GPU version of TensorFlow on Windows   

          The evolution of the machine learning field as seen through 28,303 papers   







OpenAI is a non-profit artificial intelligence organization founded jointly by a number of Silicon Valley magnates, with the goal of preventing catastrophic impacts of artificial intelligence and encouraging it to play a positive role. This article was written by OpenAI researcher Andrej Karpathy and presents the interesting conclusions he reached by analyzing the high-frequency keywords in the 28,303 papers in the arxiv-sanity machine learning paper database.


Have you ever used Google Trends (https://trends.google.com/trends/?cat=)? Its functionality is quite cool: just enter a keyword and you can see how its search volume has changed over time. The product inspired me to some extent, and it so happens that I have the 28,303 machine learning papers published on arxiv over the past five years in the arxiv-sanity database (http://arxiv-sanity.com/), so I thought: why not look at how the field has been changing? The results are quite interesting, so I decided to share them.

(Note: machine learning is an all-encompassing field; a considerable part of this article focuses on deep learning, the area I am most familiar with.)


The arxiv singularity

Let us first look at how the total number of papers submitted across all the arxiv-sanity categories (cs.AI, cs.LG, cs.CV, cs.CL, cs.NE, stat.ML) has changed over time, as shown in the figure below:




That's right: the peak is March 2017, when nearly 2,000 papers were submitted in these areas. The peak is most likely driven by conference deadlines (for example NIPS/ICML). Since not everyone uploads their papers to arxiv, and the fraction who do changes over time, the number of submitted papers does not fully reflect the size of research in the machine learning field. Still, it shows that a large number of papers are being noticed, browsed or read.


Next, we use this number as the denominator and look at how many papers contain the keywords we are interested in.


Deep learning frameworks

First, we care about the usage of deep learning frameworks. A framework is counted if it is mentioned anywhere in the paper, including the bibliography. The figure below shows the deep learning frameworks mentioned in papers submitted in March 2017:


We can see that about 10% of the papers submitted in March 2017 mention TensorFlow. Of course, not every paper states which framework it used, but if we assume that whether a framework is declared is independent of which framework it is (i.e. the papers that do declare a framework use them in relatively fixed proportions), we can infer that roughly 40% of the community is using TensorFlow (even more if you count Keras with a TensorFlow backend). The figure below shows how some common frameworks trend over time:



We can see that Theano was mainstream for a long time and then fell out of favor; Caffe grew strongly in 2014 but has been overtaken by TensorFlow in recent months; Torch (and, recently, PyTorch) is growing slowly and steadily. How will they develop in the future? That is an interesting question; personally I think Caffe and Theano will keep declining, while TensorFlow's growth will slow a little because of competition from PyTorch.


ConvNet models

What about the usage of common ConvNet models? As the figure below shows, ResNet models have surged, appearing in 9% of the papers published last March.




Also, I am curious: who was talking about inception before InceptionNet appeared?


Optimization algorithms

On the optimization side, Adam is in a league of its own, appearing in as many as 23% of all papers! Its true usage rate is hard to estimate and is probably higher than 23%, since many papers do not state the optimizer they used, and plenty of neural network research does not use any such algorithm at all. On the other hand, the figure might need to be discounted by about 5%, because the word can also refer to an author's name, and the Adam optimizer was only proposed in December 2014.



Researchers

Another metric I looked at is how often leading deep learning researchers are mentioned in papers (this is somewhat similar to citation counts, but it is better expressed as a 0/1 indicator and can be normalized by the total number of papers):


Note that 35% of papers mention "bengio", but there are two researchers named Bengio, Samy Bengio and Yoshua Bengio, and the figure shows their combined total. In particular, Geoff Hinton is also mentioned in 30% of recent papers, which is a very high proportion.


Keyword study

Finally, instead of manually categorizing keywords, this article looks at the hottest and least hot keywords in the papers.


Hottest keywords

There are many ways to define the hottest keywords; the method used here is the following: for every unigram and bigram appearing in the papers, count how often the term was used in the last year and how often before that, and rank by the ratio of the two. The top-ranked keywords are those that had limited impact a year ago but appear with very high frequency in the most recent year, as shown in the table below (the table is the result after removing duplicate terms):


For example, ResNet has a ratio of 8.17: a year earlier (March 2016) the term appeared in only 1.044% of papers, but last month 8.53% of papers contained the keyword, giving a ratio of 8.53 / 1.044 ~= 8.17.
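A rough sketch of that ranking procedure (my own illustration, not Karpathy's actual script):

```python
# Rank unigrams/bigrams by how much more often they appear in recent papers
# than in older ones, as described above.
from collections import Counter

def ngrams(text, n):
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def doc_freq(papers):
    """Fraction of papers in which each unigram/bigram appears at least once."""
    counts = Counter()
    for text in papers:
        counts.update(set(ngrams(text, 1) + ngrams(text, 2)))
    return {term: c / len(papers) for term, c in counts.items()}

def hottest_keywords(recent_papers, older_papers, min_recent=0.01):
    recent, older = doc_freq(recent_papers), doc_freq(older_papers)
    ratios = {
        term: freq / older.get(term, 1e-4)   # e.g. 8.53% / 1.044% ~= 8.17 for "resnet"
        for term, freq in recent.items() if freq >= min_recent
    }
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)
```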


So we can see that the core techniques that became popular over the past year are: 1) ResNets, 2) GANs, 3) Adam, 4) batch normalization (BatchNorm).


As for research directions, the hottest keywords are 1) style transfer, 2) deep reinforcement learning, 3) neural machine translation ("nmt"), and perhaps 4) image generation.


On the architecture side, the most popular are 1) fully convolutional networks (FCN), 2) LSTMs/GRUs, 3) Siamese networks, and 4) encoder-decoder networks.


Most "out of fashion" keywords

Conversely, which keywords have fallen out of favor over the past year? See the table below:



I am not sure what "fractal" refers to, but broadly speaking, Bayesian nonparametrics seems to have become less popular.

Conclusion

So, it is time to submit a paper that applies fully convolutional networks, encoder-decoders, batch norm, ResNets and GANs to style transfer, optimized with Adam. Hey, that doesn't even sound too far-fetched :)




          Google releases MobileNets, a collection of computer vision models   
Google has released MobileNets, a collection of computer vision models for TensorFlow that run entirely on mobile devices. Mobile devices can access many of these computer vision technologies through the cloud. Now, with MobileNets, mobile devices can directly classify and detect objects seen through the camera. […]
          Presentation: In Depth TensorFlow   

Illia Polosukhin keynotes on TensorFlow, introducing it and presenting the components and concepts it is built upon.

By Illia Polosukhin
          TensorFlow: report on the second "Machine-Learning and Data Science" Meetup in Rome (February 19, 2017)   

170219-tensorflow.jpg

On February 16, 2017, the second meeting of the "Machine Learning and Data Science" Meetup (web, fb, slideshare) took place in Rome, at the Talent Garden in Cinecittà: the meeting, organized together with the Google Developer Group Roma Lazio Abruzzo, was dedicated to a presentation of TensorFlow, Google's solution for deep learning within machine learning.

Below is a brief summary of the February 16, 2017 meeting.

Before starting:

•    What is Machine Learning? Interview with Simone Scardapane on the Machine Learning Lab (December 1, 2016)
•    What is TensorFlow? Andrea Bessi, "TensorFlow CodeLab", Nov 16, 2016

Report

Preamble

On February 15, 2017, the "TensorFlow Dev Summit" was held in Mountain View (USA), Google's first official event dedicated to TensorFlow, the deep learning solution released about a year earlier and now at its "production ready" 1.0 version.
The "TensorFlow Dev Summit" opened with a keynote by Jeff Dean (wiki, web, lk), Google Senior Fellow, Rajat Monga (tw), TensorFlow lead in the Google Brain team, and Megan Kacholia (lk), Engineering Director of the TensorFlow/Brain team.

To learn more about the #TFDevSummit 2017:

The Rome "Machine-Learning" Meetup was an opportunity to watch and comment on the video together and also to give a short presentation of TensorFlow.
At the end of the event there was a small reception open to all deep learning enthusiasts in Rome.

Simone Scardapane: "TensorFlow and Google, one year of exciting breakthroughs"

Simone (mup, web, fb, lk) opened the meeting with a short presentation on deep learning and TensorFlow.
"TensorFlow," Simone said, "has made the neural networks that underpin deep learning and artificial intelligence accessible to everyone. Before TensorFlow there was a real difficulty in managing and training large, complex neural networks."
Modern deep learning architectures do in fact use very complex neural networks: in imaging applications, for example, the architectures involve tens of millions of parameters.

170219-reteneurale.jpg

(Neural network, image taken from http://joelouismarino.github.io/blog_posts/blog_googlenet_keras.html)

One of the big advantages of TensorFlow is that it lets you define a neural network symbolically: the tool also provides an efficient compiler that handles the back-propagation process automatically.
TensorFlow can moreover be used through a simplified interface such as Keras, getting close to a sort of declarative programming.
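As a rough illustration of this symbolic style (my own sketch, not code from the talk), a small graph is declared first and TensorFlow derives the gradients needed for back-propagation automatically:

```python
# TensorFlow 1.x: build a symbolic graph, then let tf.gradients do back-propagation.
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 2])
w = tf.Variable(tf.zeros([2, 1]))
y = tf.sigmoid(tf.matmul(x, w))            # nothing is computed yet, only declared
loss = tf.reduce_mean(tf.square(y - 1.0))
grads = tf.gradients(loss, [w])            # gradients derived automatically

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grads, feed_dict={x: [[0.5, -1.0]]}))
```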
The first release of TensorFlow, Simone recalled, came out in November 2015 under the open Apache 2.0 license (what is it?). On February 15, 2017, during the TFDevSummit, version 1.0 of TensorFlow was announced, the first "production ready" release.
The availability of an open, user-friendly deep learning environment has enabled the growth of a large community of experts, researchers and enthusiasts, and the release of high-impact software applications. Simone showed a few examples.

1) Neural image captioning: software able to recognize and describe or caption images.

170216-tensorflow-s2.jpg

170216-ragazzo-aquilone.jpg

170216-tren-b-n.jpg

2) Google Neural Machine Translation (GNMT), which enabled the overhaul of Google Translate thanks to deep learning: instead of translating word by word, the text can now be analyzed as a whole, capturing its meaning and context with a level of accuracy now close to human translation.

nmt-model-fast.jpg

170216-neural-translation.jpg
170216-neural-translation2.jpg

3) Generative Adversarial Networks (GANs): systems able to generate new data thanks to an enormous training set, working with a pair of neural networks: the first produces new data, the second checks the "goodness" of the result; such systems have already been used to generate artificial images and video-game scenery and to enhance low-quality images and video footage.

170216-generative-image

4) AlphaGo: deep learning is also behind the recent spectacular successes of AI in board games, such as the victory of Google's AlphaGo against the world Go champion.

170216-alphago.jpg

5) WaveNet, a generative model for raw audio, able to generate speech that imitates a human voice far more naturally than the best Text-to-Speech systems available today. WaveNet has also already been used to create artificial music.

Simone closed his talk by noting that deep learning and ML will also be discussed in a dedicated track at Data Driven Innovation Roma 2017, to be held at Roma Tre University on February 24 and 25, 2017.

Summary of the keynote by Jeff Dean, Rajat Monga and Megan Kacholia on TensorFlow

The opening keynote of the TF Dev Summit 2017, delivered by Jeff Dean, Rajat Monga and Megan Kacholia, covered:

  • the origins and history of TensorFlow
  • the progress made since the first open-source version of TensorFlow was released
  • TensorFlow's growing open-source community
  • TensorFlow performance and scalability
  • applications of TensorFlow
  • exciting announcements!

Jeff Dean

jeff dean.jpg

Jeff said that Google's goal with TensorFlow is to build a machine learning platform usable by anyone.
TensorFlow was released about a year ago, but Google's work in machine learning and deep learning began 5 years earlier.
The first system built, in 2012, was DistBelief, a proprietary neural network system suited to a production environment like Google's, based on distributed systems (see "Large Scale Distributed Deep Networks" in pdf). DistBelief was used in many successful Google products such as Google Search, Google Voice Search, advertising, Google Photos, Google Maps, Google Street View, Google Translate and YouTube.
But DistBelief had many limitations: "we wanted a much more flexible and general-purpose system," Jeff said, "one that was open source, adopted and developed by a large community around the world, and aimed not only at production but also at research. So in 2015 we announced TensorFlow, a solution able to run in many environments including iOS, Android and Raspberry Pi, to run on CPUs, GPUs and TPUs as well as on Google Cloud, and to be driven from languages such as Python, C++, Go, Haskell and R."

170216-tensorflow-keynote1.jpg
TensorFlow also has sophisticated tools for data visualization, and this has helped a large open-source community grow around TensorFlow.

170216-tensorflow-keynote2.jpg

Rajat Monga

rajatmonga2.jpg

Rajat Monga officially announced the release of TensorFlow 1.0 and walked through its new features.

170216-tensorflow-keynote3.jpg

170216-tensorflow-keynote4.jpg

Rajat then presented TensorFlow's new APIs.

170218-tf-model.jpg

TensorFlow 1.0 supports IBM's PowerAI distribution, the Movidius Myriad 2 accelerator and the Qualcomm Snapdragon Hexagon DSP. Rajat also announced the availability of XLA, an experimental TensorFlow compiler specialized in just-in-time compilation and algebraic computation.

170218-tf-xla.jpg

Megan Kacholia

megankacholia2.jpg

Megan Kacholia went deeper into the performance of TensorFlow 1.0.

170216-tensorflow-keynote7.jpg

In production you can use many different architectures: server farms, CPU-GPU-TPU, low-latency servers (as on mobile), because TensorFlow 1.0 is optimized to deliver excellent performance in all environments.

170218-tf-performance1.jpg

Megan then showed examples of TensorFlow being used in cutting-edge research and in practical applications on mobile.

170218-tf-research2.jpg

170218-tf-research1.jpg

Conclusion of the keynote

To close the keynote, Jeff spoke again to thank everyone who contributes to the TensorFlow community without being part of Google.
"From the community," Jeff said, "come suggestions, requests and even brilliant solutions that we at Google had not yet thought of," citing the case of a Japanese farmer who built a TensorFlow application on a Raspberry Pi to recognize crooked cucumbers and discard them during packing.
In medicine, Jeff recalled, TensorFlow has been used for the diagnosis of diabetic retinopathy (a summary here) and at Stanford University for treating skin cancer.

Contacts

•    "Machine-Learning and Data Science" Meetup in Rome: website and Facebook page

Further reading

Video

More information about TensorFlow

See also


          Report on the first "Machine-Learning and Data Science" Meetup (February 3, 2017)   

170203-ml-meetup.jpg

On February 2, 2017, the first meeting of the "Machine-Learning and Data-Science" Meetup (fb) took place in Rome, in the Sala Storica at LUISS ENLABS.

Agenda

  • Simone Scardapane and Gabriele Nocco, introduction to the "Machine-Learning and Data Science" Meetup
  • Gianluca Mauro (AI-Academy): Artificial intelligence for your business
  • Lorenzo Ridi (Noovle): Serverless Data Architecture at scale on Google Cloud Platform

Simone Scardapane and Gabriele Nocco, introduction to the "Machine-Learning and Data Science" Meetup

161022-simone-scardapane.jpg

(Simone Scardapane)

Simone (mup, web, fb, lk) quickly outlined the goals of the "Machine-Learning and Data-Science" Meetup.
"We want to create a community of ML, AI and Data Science enthusiasts and professionals," Simone said, "a place to find answers to questions such as:

  1. I'm passionate about ML, where do I find other experts?
  2. We're looking for someone with ML expertise, do you know anyone?
  3. I'd like to get into ML, how do I start?"

gabrielenocco.jpeg

(Gabriele Nocco)

Gabriele Nocco (mup, fb, lk) announced that the second Meetup event will be held in Rome on February 16, 2017 at the Talent Garden in Cinecittà (map), in collaboration with the Google Dev Group of Rome. To attend you need to register, free of charge, on EventBrite.
"We will talk about TensorFlow and screen the keynote and some highlights of the first TensorFlow Dev Summit for all deep learning enthusiasts and, thanks also to Google's kind sponsorship, we will have our second shared networking moment in the beautiful spaces available to us," Gabriele said.
innocenzo-sansone.jpg

(Innocenzo Sansone)

Innocenzo Sansone (fb, tw, lk), one of the organizers and sponsors, also spoke, reminding the audience that Codemotion will take place in Rome on March 24-25, 2017, including, among other things, a dedicated track on Machine Learning.

Gianluca Mauro (AI-Academy): Artificial intelligence for your business

170203-gianluca-mauro2.jpg

(Gianluca Mauro)

Gianluca (blog, lk), an engineer, entrepreneur and expert in AI, ML and Data Science, is also one of the 3 founders, together with Simone Totaro (lk) and Nicolò Valigi (lk), of AI-Academy, a startup whose goal is to promote the use of Artificial Intelligence in corporate business processes (see the AI Academy manifesto).

ai-academy.jpg

A brief history of Artificial Intelligence

In the first part of his talk Gianluca outlined the historical development of Artificial Intelligence.
The beginnings of AI go back to the conference held at Dartmouth, USA, in 1956, organized by John McCarthy, Marvin Minsky, Nathaniel Rochester and Claude Shannon: for the first time people spoke of "artificial intelligence" and set the goal of "building a machine that fully simulates human intelligence", proposing research themes that would see great development in the following years: neural networks, computability theory, creativity, and natural language processing.

dartmouth.jpg
 
Image source: http://www.scaruffi.com/mind/ai.html

AI research was generously funded by the US government until the mid-1960s: faced with the lack of concrete results, the funding stopped, giving rise to the first "AI winter" (1966-1980).
In the 1980s AI regained momentum thanks to a paradigm shift: instead of pursuing the goal of artificially reproducing the whole of human intelligence, the field fell back on building "expert systems" able to simulate knowledge in well-delimited domains.
This second attempt also met with little success, however, causing a new "AI winter" that lasted until the early 1990s, when a new discipline began to take hold: Machine Learning.

What is Machine Learning?

AI.jpg

Image source: http://www.thebluediamondgallery.com/tablet/a/artificial-intelligence.html

Machine Learning, Gianluca explained, is a branch of Artificial Intelligence that aims to build algorithms which adapt automatically based on the input data they receive, so as to produce "intelligent" results such as predictions and recommendations.
Gianluca gave the example of a child learning to walk: there is no need to know the law of gravity; it is enough to watch how mum walks and keep trying until you find your balance.

What is Deep Learning?

Deep learning is a subset of Machine Learning concerned with the design, development, testing and above all training of neural networks and other technologies for automatic learning.
Deep learning is behind AI's spectacular successes in board games: the chess victory of IBM's Deep Blue against the reigning world champion, Garry Kasparov, and the victory of Google's AlphaGo against the world Go champion.

This is the golden age of Artificial Intelligence

According to Gianluca Mauro, this is the magic moment for AI because we finally have the tools - algorithms, data, computing power - needed to build ML applications at ever lower cost.
The algorithms are by now well proven, thanks to the work published over the last few years, starting with that of Corinna Cortes ("Support-vector networks") and David Rumelhart ("Learning representations by back-propagating errors").
Computing power is represented mainly by the large amount of open-source technology available.
The combination of all these factors is revolutionary, as Chris Dixon, one of the best-known figures in US venture capital, puts it:
"Most of the research, tools and software related to ML are open source. This has had a democratizing effect, allowing small companies and even individuals to build truly powerful applications. WhatsApp was able to build a global messaging system serving 900 million users with just 50 engineers, compared with the thousands of engineers that were needed for earlier messaging systems. This 'WhatsApp effect' is now happening in Artificial Intelligence. Software tools like Theano and TensorFlow, combined with cloud data centers for training and low-cost GPUs for deployment, now allow small teams of engineers to build innovative, competitive artificial intelligence systems."
According to Gianluca, AI will soon be a necessity for any company, or, to quote Pedro Domingos: "A company without Machine Learning can't keep up with one that uses it".
According to Andrew Ng, chief scientist at Baidu, AI and ML are already transforming companies because they will force them to revolutionize their production processes, just as happened in the 1800s when low-cost electricity became available for the first time (video).
This cultural shift can already be felt in venture capital and in mergers & acquisitions: large companies are not just looking for startups doing pure ML research, but for startups building services and products with ML embedded.
"We are at the dawn of a new era," Gianluca concluded, "that of Machine Learning as a feature".

Lorenzo Ridi (Noovle): Serverless Data Architecture at scale on Google Cloud Platform

lorenzo-ridi.jpeg

(Lorenzo Ridi)

Lorenzo Ridi (mup, fb, lk, tw) presented a concrete use case (available here in its full version, also on SlideShare) to show the advantages of building on Google Cloud Platform, using only serverless components, for applications with embedded machine learning.
The use case concerns a company that, with Black Friday approaching, decides to commission an analysis of social media, Twitter in particular, to capture insights useful for positioning its products correctly before and during the event: this is all the more crucial given the enormous size of the company's catalogue, because aiming the advertising and promotional campaign at the wrong targets would be a fatal mistake.
To handle the heavy traffic expected during the event, ACME's engineers decide to move away from traditional technologies and implement this architecture on Google Cloud Platform, using only serverless components:

170203-google-architecture-ml1.jpg
 
Ingestion

To collect the data, a simple Python application is implemented which, through the TweePy library, accesses Twitter's Streaming API and retrieves the stream of messages about Black Friday and related topics.
To make sure this component also meets the required standards of reliability, it is run inside a Docker container on Google Container Engine, the Kubernetes implementation on Google Cloud Platform. This way we don't need to worry about outages or failures: everything is managed (and, when necessary, automatically restarted) by Kubernetes.
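
Purely as an illustration, a minimal sketch of what such an ingestion script might look like is shown below; the credentials, project, topic name and tracked keywords are placeholders, and the exact tweepy / google-cloud-pubsub call signatures depend on the library versions used.

#coding=utf-8
import tweepy
from google.cloud import pubsub_v1

# Placeholder credentials and resource names -- replace with real values.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-gcp-project", "blackfriday-tweets")

class PubSubListener(tweepy.StreamListener):
    def on_data(self, raw_data):
        # Forward the raw JSON tweet to Pub/Sub; parsing happens downstream in Dataflow.
        publisher.publish(topic_path, raw_data.encode("utf-8"))
        return True

    def on_error(self, status_code):
        # Stop on rate limiting (HTTP 420) instead of hammering the API.
        return status_code != 420

stream = tweepy.Stream(auth, PubSubListener())
stream.filter(track=["black friday", "#blackfriday"])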

170203-google-architecture-ml2.jpg
 
First of all, we create the Docker image we will use for the deployment. To do this it is enough to write a suitable Dockerfile with the instructions to install the required libraries, copy our application and launch the script:

170203-google-architecture-ml3.jpg
 
Et voilà! The Docker image is ready to run anywhere: on our laptop, on an on-prem server or, as in our case, inside a Kubernetes cluster. Deployment on Container Engine is very simple with the Google Cloud Platform command-line tool: just three commands to create the Kubernetes cluster, obtain the access credentials and run the application, scalably and reliably, inside a ReplicationController.
The second element in the chain, i.e. the component to which our application will send the tweets, is Google Pub/Sub, a fully-managed middleware solution implementing a Publisher/Subscriber architecture in a reliable and scalable way.
In the processing phase we use two more tools from the Google Cloud Platform suite:

  • Google Cloud Dataflow is an open-source Java SDK – now known as Apache Beam – for building parallel processing pipelines. Cloud Dataflow is also the fully-managed service, running on Google infrastructure, that executes processing pipelines written with Apache Beam in an optimized way.
  • Google BigQuery is a fully-managed analytic data warehouse. Its astonishing performance, which we have had occasion to highlight several times, makes it an excellent fit in data analytics architectures.

The pipeline we are going to design is extremely simple. In practice it does nothing more than transform the JSON structure describing each tweet, sent by the Twitter API and delivered by Pub/Sub, into a BigQuery record. Each record is then written to a table through the BigQuery Streaming API, so that the data can be analyzed immediately.
 170203-google-architecture-ml4.jpg
The pipeline code is extremely simple; this is indeed one of the strengths of Apache Beam compared with other processing paradigms such as MapReduce. All we have to do is create a Pipeline object and then repeatedly call the apply() method to transform the data as needed. It is interesting to note how the data is read and written using two I/O components included in the SDK, PubSubIO and BigQueryIO, so no boilerplate code is needed to integrate the systems.
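
The pipeline in the talk is written with the Java SDK; purely as an illustrative sketch, a roughly equivalent pipeline with the Apache Beam Python SDK could look like the following. The project, topic and table names are placeholders, the mapping keeps only a few fields rather than the full tweet schema, and the connector names may differ between SDK versions.

import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def tweet_to_row(raw_tweet):
    # Keep only the fields needed later; the real pipeline maps the full schema.
    tweet = json.loads(raw_tweet)
    return {"id": tweet["id_str"],
            "text": tweet["text"],
            "created_at": tweet["created_at"]}

options = PipelineOptions(streaming=True)  # runner, project, etc. come from the command line
with beam.Pipeline(options=options) as p:
    (p
     | "ReadFromPubSub"  >> beam.io.ReadFromPubSub(topic="projects/my-gcp-project/topics/blackfriday-tweets")
     | "ToBigQueryRow"   >> beam.Map(tweet_to_row)
     | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-gcp-project:twitter.tweets",
            schema="id:STRING,text:STRING,created_at:STRING",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))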

Machine learning

To visualize the results we use Google Data Studio, a tool from the Google Analytics suite that lets us build graphical visualizations of various kinds from different data sources, BigQuery obviously among them.
The dashboards can then be shared, or made publicly accessible, exactly as we would do with a Google Drive document.

170203-ml5.jpg
 
This chart shows the number of tweets collected from each state of the Union. Certainly striking, but not very useful for our purpose. Indeed, after some exploratory data analysis we realize that with the collected tweets alone we cannot carry out very "advanced" analyses. We therefore need to revise our processing procedure and try to infer some more "interesting" knowledge.
Google Cloud Platform comes to our aid here with a series of APIs, based on machine learning algorithms, whose purpose is precisely to add a pinch of "intelligence" to our analysis process. In particular we will use the Natural Language API, which lets us retrieve the sentiment of each tweet, i.e. a numerical indicator of how positive (or negative) the text of the message is.

170203-google-architecture-ml6.jpg
 
The API is very simple to use: it takes a text as input (our tweet) and returns two parameters:

  • Polarity (FLOAT ranging from -1 to 1) expresses the mood of the text: positive values denote positive feelings.
  • Magnitude (FLOAT ranging from 0 to +inf) expresses the strength of the feeling. Higher values denote stronger feelings (be they anger or joy).

Our own, admittedly simplistic, definition of "sentiment" is simply the product of these two values. In this way we can assign a numerical value to each tweet – and, hopefully, extract some interesting statistics from it!
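
As a minimal illustration of this per-tweet score, the call might be sketched in Python as follows. The tweet_sentiment helper is ours, the example tweet is invented, and the google-cloud-language client shown is the v1 client, where polarity is exposed as score; exact signatures vary between library versions.

from google.cloud import language

def tweet_sentiment(text):
    # Return polarity * magnitude, the simplistic per-tweet "sentiment" defined above.
    client = language.LanguageServiceClient()
    document = language.types.Document(
        content=text,
        type=language.enums.Document.Type.PLAIN_TEXT)
    result = client.analyze_sentiment(document=document).document_sentiment
    # result.score (polarity) is in [-1, 1], result.magnitude in [0, +inf)
    return result.score * result.magnitude

print(tweet_sentiment("Best Black Friday deal ever, totally worth the wait!"))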
The Dataflow pipeline is then modified to include this new step alongside the previous flow. The change is very simple and, thanks to Cloud Dataflow's programming model, allows considerable reuse of the existing code.

170203-google-architecture-ml7.jpg
 

With this new data we can produce much more interesting analyses, telling us about the geographical and temporal distribution of the "sentiment" around the Black Friday event.
The following map, for example, shows the average sentiment recorded in each US state; darker colors represent more negative sentiment (that red square in the middle is Wyoming).

170203-google-architecture-ml8.jpg
 
This other analysis shows the sentiment trend for the three major US vendors: Amazon, Walmart and Best Buy. Starting from this simple analysis, with a bit of drill-down into the data, we managed to uncover some interesting facts:

  • Twitter users did not appreciate Walmart's decision to bring the start of its sales forward to the day before Black Friday, the national holiday of Thanksgiving. Walmart's popularity was in fact undermined from early November by this decision – after all, the protection of workers is a universal concern.
  • Amazon's promotional sales (which opened on November 18, well ahead of Black Friday) were initially harshly criticized by users, with a drop in popularity that hit its lowest point on the 22nd. Later, though, the online retail giant regained ground on Best Buy, which instead seems to have kept its good reputation intact throughout the period.

170203-google-architecture-ml9.jpg


          TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level ML Frameworks – KDD 2017   
Martin Wicke gives a quick overview of the considerations that went into designing TensorFlow’s APIs, described in the paper “Train and Distribute: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks” presented at goo.gl/itoiZB Watch Martin’s talk at I/O ’17 about effective TensorFlow for non-experts: goo.gl/ZDMBcX Subscribe to the Google Developers channel: goo.gl/mQyv5L And here’s our […]
          Mosquito classifier | Tensorflow   
Researchers can use this classifier to automatically distinguish a particular type of mosquito from a
          Autoencoders in Keras, Part 6: VAE + GAN   

Contents



Two parts ago we built a CVAE autoencoder whose decoder can generate a digit of a given label; we also tried generating pictures of digits of other labels in the style of a given image. The results were quite good, but the generated digits came out blurry.
In the previous part we looked at how GANs work and obtained fairly sharp digit images, but we lost the ability to encode and to transfer style.

In this part we will try to take the best of both approaches by combining variational autoencoders (VAE) and generative adversarial networks (GAN).

The approach described below is based on the paper [Autoencoding beyond pixels using a learned similarity metric, Larsen et al, 2016].



Illustration from [1]
Read more →
          Ubuntu 17.04's motd advertises a cable TV series   

I liked comment #2 on the bug: having a mechanism to refresh the post-login message with up-to-date, relevant content can be quite useful, but using it for marketing of Canonical and/or a TV series is an invitation to turn it off, making it unavailable at the very moment it could actually be relevant.

This is not a prank. Bug number 1701068 (reference link), opened by user Zachary Fouts, describes a script located at /etc/update-motd.d/50-motd-news that fetches content from the URL https://motd.ubuntu.com/ and appends it to the end of the MOTD.

At the moment, that URL serves plain-text content advertising HBO's Silicon Valley series. Content:

* How HBO's Silicon Valley built "Not Hotdog" with mobile TensorFlow, Keras & React Native on Ubuntu - https://ubu.one/HBOubu

What the bug reporter questions is the purpose of the MOTD, which on Ubuntu Servers should be used for relevant content such as incidents and security issues, not for items that matter only to desktop users. Other users commenting on the bug raise the same complaint.

Submitted by Nícolas Wildner (nicolasgauchoΘgmail·com)

O artigo "motd do Ubuntu 17.04 faz propaganda de série de TV a cabo" foi originalmente publicado no site BR-Linux.org, de Augusto Campos.


          Saturday Morning Videos: The Changing Landscape of Education and The Awesome Siraj Raval   
Starting in 2013, one of the most important things we noticed in the roster of the Paris Machine Learning meetup was a new breed of people, young and not so young, who were getting their education from Andrew Ng's Coursera course*. That trend has accelerated over time, and many of the close to 5,800 members of the meetup are getting their education from non-traditional outlets. We sometimes hear stories of students teaching their engineering school professors. Some students even take time off from their studies to focus on getting an education through Kaggle competitions.

In this new education landscape, I recently noticed the awesome Siraj Raval; here is one of his recent YouTube videos, on second-order approximation:


All of Siraj's videos are on his channel. What's interesting here is that the subjects are not all as simple as the video above. Siraj often does some live coding and gets into recent subjects of interest in the ML literature, such as:

Siraj obviously has a GitHub repo.


Andrew was kind enough to give a talk at the end of season 1 of the meetup. The room at Google was filled to the 250 people it could accommodate, in large part due to Andrew's status as a teacher to many in the room through the Coursera outlet.



          Tweets for Saturday, July 1 (part 8)   

New song out. I composed "嫣橙色", the character song for "嫣汐", a voice library for the Chinese speech-synthesis software MUTA. The official PV is now up on the Chinese video site bilibili bilibili.com/video/av116198…

— Yunomi🍵 (@iamyunomi) July 1, 2017 - 00:08

This looks interesting, though…
A restaurant where a chef cooks dishes from generated recipes, now that's new! twitter.com/goroman/status…

— わっふるめーかー@7/15 VRZONE (@waffle_maker) July 1, 2017 - 21:39

I assumed this was about having TensorFlow machine-learn cooking and generate recipes for dishes that don't yet exist; turns out it wasn't twitter.com/laiso/status/8…

— GOROman (@GOROman) July 1, 2017 - 20:25

TensorFlow Machine Learning Cookbook: 60+ Python-based recipes (Impress) amazon.co.jp/dp/4295002003/… via @amazonJP

— laiso (@laiso) July 1, 2017 - 19:35

I could gaze at the original life-size one forever. Back at Joypolis I spent a whole day in the seat in front of it. Good to see it again after so long.

— なおっさん (@nao_square) July 1, 2017 - 21:32

What VR experts had to say about the possibilities of VR technology and business outside the games industry [PR] | CodeIQ MAGAZINE codeiq.jp/magazine/2017/… via @codeiq

It's a round-table piece and my first time writing as the moderator; please take a look.

— kure (@kure_kure_zo) July 1, 2017 - 21:20

Once you're single and earning over 400k yen a month, going drinking with colleagues after work
stops being an ordeal and becomes a genuine pleasure.

So if you're an older guy lamenting that young people these days hate going out drinking,
please just pay them 400k a month.
They'll come invite you themselves, with a genuinely happy smile.

— むーちゃ@めんどくさいオッサン (@mucha610610) July 1, 2017 - 10:19

Net right-wingers used to look at engineers hired by Chinese or Korean capital and say they lacked "patriotism"; I'd rather ask
where the patriotism is in working for executives who fix their own management mistakes by laying off subordinates, without ever cutting their own pay.

— AveNoF (@AveNoF_Pyrit) July 1, 2017 - 09:06

Him stuck pic.twitter.com/zAwHDn2BU2

— Boah (@zboah) June 22, 2017 - 01:15

12 Animal Brothers From Other Mothers buff.ly/2sNrI9z

— Animal Life (@MeetAnimals) June 21, 2017 - 00:39

@info_PW Thanks for today. It was a pleasure to meet you. (^_^)

— そむにうむ@森山弘樹 VR養成本執筆 (@Somnium) July 1, 2017 - 23:38

@mizchi That's just the people who stand out on the internet; I'd say around 95% of engineers are getting a raw deal.

— イキリーマン (@joker1007) July 1, 2017 - 00:54

The episodes in that roundup of "who's the toughest fighter in show business": Riki Yasuoka gets beaten up so badly, so often, that I can't stop laughing pic.twitter.com/pc6b7i0Pks

— リリングミスト (@RelyingMist) July 1, 2017 - 22:48

Really fascinating, and just great / Why modern-day wizard Yoichi Ochiai is having two "RoBoHoN" robots help with child-rearing #AmebaNews news.ameba.jp/20170701-316

— やまぐち しょうせい (@shosemaru) July 1, 2017 - 22:23

@yura_greycube @anitemp Everyone's young, nothing but bright-eyed youngsters! (with a few exceptions)
The meeting fatigue blew right away, but… while doing PF checks and prototype checks I missed out on the yakitori! Precious protein! (>_<)

— Kazuhisa SHOUSHIN (@info_PW) July 1, 2017 - 23:34

In China, toy crossbows that can fire toothpicks and nails, hugely popular among children, have been banned, and the authorities have carried out a nationwide crackdown on toy shops. afpbb.com/articles/-/313…

— AFPBB News (@afpbbcom) July 1, 2017 - 21:03

A culture festival at a certain girls' school. Looking for whoever cast a vote for Tokyo Tech. pic.twitter.com/GJX1V04Ium

— そーみや (@akutagawalove) July 1, 2017 - 17:13

@Somnium A bit late to say it, but I never got a proper look at Append Miku!! lol
The honor was all mine, getting to meet a veteran pioneer like Somnium-sama!
Next time I'd love to do a study session before the drinking party.

— Kazuhisa SHOUSHIN (@info_PW) July 1, 2017 - 23:41

@info_PW Come to think of it, things were so hectic I couldn't show it to you properly. My wife wasn't feeling well today so I had to bow out after the first round, but I'd like to plan a study session. We can use the game vocational school where I work (in Namba) as the venue. (^_^)

— そむにうむ@森山弘樹 VR養成本執筆 (@Somnium) July 1, 2017 - 23:44

Sharpen your sense for "noticing" ideas and create "the normal of ten years from now"! Paradoxical thinking to survive the super-AI era (3): Yoichi Ochiai × Takaaki Umada | BUSINESS INSIDER businessinsider.jp/post-34694 via @BIJapan

— Toshiaki Takeuchi (@tosh728) July 1, 2017 - 22:15

←Before After→

#フレームアームズガール #FAガール #スティレット #フェイスリペイント pic.twitter.com/O3SFQqVP9l

— bittersweets⋆ (@bsdolls) July 1, 2017 - 23:32

Tonight I joined the CG/modeling drinking party organized by Kodomo Beer-san!
Everyone's so young! (with a few exceptions)

— Kazuhisa SHOUSHIN (@info_PW) July 1, 2017 - 23:20

About that Akizuki business: please also stop phrasing it as "an engineer caused an uproar". At the very least write "self-proclaimed engineer". It's mortifying to have that person called an engineer.

— tokoya (@tokoya) July 1, 2017 - 10:14

SALTBAE! pic.twitter.com/RUtv1CBsP8

— Nusret #saltbae (@nusr_ett) July 1, 2017 - 23:03

Onoooo!!! #遣隋使
sketchfab.com/dino780928/col…

— 岡島(職種がよくわからない) (@okajimania) July 1, 2017 - 23:29

Exploring a deserted building in the dark feels like real-life Resident Evil, or The Walking Dead. (This is where we all stayed the night) pic.twitter.com/mv3eLbVQEe

— n_ryota@AX2017 (@n_ryota) July 1, 2017 - 23:26

People rant that Chinese companies pay well but just suck out your technology and toss you aside, yet the bonus a Japanese company paid Mr. Nakamura, who invented the blue LED, was a mere 20,000 yen… twitter.com/xrayspex7/stat…

— 涅槃 (@xrayspex7) July 1, 2017 - 14:11

If we're only talking starting pay, Microsoft also offers 7 million yen a year, so Huawei is nothing special; it has simply become known that Japan has been underpaying its engineers lol

— M. Morise (忍者系研究者) (@m_morise) June 30, 2017 - 21:41

A junior cat spinning a fidget spinner and a senior cat who won't stand for it pic.twitter.com/h74n2130yB

— ねこナビ編集部 (@b_ru_ru) June 3, 2017 - 10:49

I played the VR app where Ouka-san cuddles up and sleeps next to you twitter.com/oukaichimon/st…

— Motoaki Tanigo (@tanigox) July 1, 2017 - 23:13
          Hierarchical Attention Network for Document Classification -- TensorFlow implementation   

Last week we covered the model architecture of the paper "Hierarchical Attention Network for Document Classification"; this week I found some time to implement it in TensorFlow. Below I walk through, mainly from a code perspective, how to implement the HAN model for text classification.

Dataset

First, the dataset. The paper uses several fairly large datasets, including IMDB movie ratings and Yelp restaurant reviews. After settling on Yelp 2013 I was completely lost at first: every dataset download link in the related papers and materials points to the official Yelp site, but I could not find the data there, and searching around turned up nothing either. I finally found it on GitHub (yes, this is just me venting about how hard the data was to find). The link is:
https://github.com/rekiksab/Yelp/tree/master/yelp_challenge/yelp_phoenix_academic_dataset
The repository seems to contain more than one dataset; there are also user, business and a few other files, but we do not need them here. Let's first look at the data format. As shown below, each line is the text of one review, stored as JSON, with fields such as votes, user_id, review_id, stars, date, text, type and business_id. For this task we only need the stars rating and the review text, so I first extract and save the relevant data as our dataset. The code is as follows:

{"votes": {"funny": 0, "useful": 5, "cool": 2}, "user_id": "rLtl8ZkDX5vH5nAx9C3q5Q", "review_id": "fWKvX83p0-ka4JS3dc6E5A", "stars": 5, "date": "2011-01-26", "text": "My wife took me here on my birthday for breakfast and it was excellent.  The weather was perfect which made sitting outside overlooking their grounds an absolute pleasure.  Our waitress was excellent and our food arrived quickly on the semi-busy Saturday morning.  It looked like the place fills up pretty quickly so the earlier you get here the better.\n\nDo yourself a favor and get their Bloody Mary.  It was phenomenal and simply the best I've ever had.  I'm pretty sure they only use ingredients from their garden and blend them fresh when you order it.  It was amazing.\n\nWhile EVERYTHING on the menu looks excellent, I had the white truffle scrambled eggs vegetable skillet and it was tasty and delicious.  It came with 2 pieces of their griddled bread with was amazing and it absolutely made the meal complete.  It was the best \"toast\" I've ever had.\n\nAnyway, I can't wait to go back!", "type": "review", "business_id": "9yKzy9PApeiPPOUJEtnvkg"}

For preprocessing I simplified things a bit and turn every review into a 30*30 matrix. Strictly speaking this is not necessary: it would be enough to truncate anything longer than 30, while anything shorter than 30 would not need padding, as long as you later pick a maximum length for each batch and record the actual size of each sample. I have not fully worked that part out yet; I will add that feature when I find the time (a sketch of this per-batch approach appears right after the preprocessing code below). For now we make do with the fixed-size version.

#coding=utf-8
import json
import pickle
import nltk
from nltk.tokenize import WordPunctTokenizer
from collections import defaultdict

# NLTK sentence and word tokenizers
sent_tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
word_tokenizer = WordPunctTokenizer()

# record each word and its frequency
word_freq = defaultdict(int)

# read the dataset, tokenize it, count how often each word occurs and store the counts in word_freq
with open('yelp_academic_dataset_review.json', 'rb') as f:
    for line in f:
        review = json.loads(line)
        words = word_tokenizer.tokenize(review['text'])
        for word in words:
            word_freq[word] += 1

    print "load finished"

# save the word-frequency table
with open('word_freq.pickle', 'wb') as g:
    pickle.dump(word_freq, g)
    print len(word_freq)#159654
    print "word_freq save finished"

num_classes = 5
# sort words by frequency in descending order (the print below shows the 10 most and 10 least frequent)
sort_words = list(sorted(word_freq.items(), key=lambda x:-x[1]))
print sort_words[:10], sort_words[-10:]

# build the vocabulary; words occurring 5 times or fewer are dropped and treated as UNKNOW_TOKEN
vocab = {}
i = 1
vocab['UNKNOW_TOKEN'] = 0
for word, freq in word_freq.items():
    if freq > 5:
        vocab[word] = i
        i += 1
print i
UNKNOWN = 0

data_x = []
data_y = []
max_sent_in_doc = 30
max_word_in_sent = 30

# Convert every review into a 30*30 index matrix: 30 sentences per document,
# 30 words per sentence; longer parts are truncated, shorter ones are zero-padded,
# and the result is saved to the final dataset file.
with open('yelp_academic_dataset_review.json', 'rb') as f:
    for line in f:
        doc = []
        review = json.loads(line)
        sents = sent_tokenizer.tokenize(review['text'])
        for i, sent in enumerate(sents):
            if i < max_sent_in_doc:
                word_to_index = []
                for j, word in enumerate(word_tokenizer.tokenize(sent)):
                    if j < max_word_in_sent:
                        word_to_index.append(vocab.get(word, UNKNOWN))
                # zero-pad each sentence to exactly max_word_in_sent word ids
                word_to_index += [0] * (max_word_in_sent - len(word_to_index))
                doc.append(word_to_index)
        # zero-pad each document to exactly max_sent_in_doc sentences
        doc += [[0] * max_word_in_sent for _ in range(max_sent_in_doc - len(doc))]

        label = int(review['stars'])
        labels = [0] * num_classes
        labels[label-1] = 1
        data_y.append(labels)
        data_x.append(doc)
    pickle.dump((data_x, data_y), open('yelp_data', 'wb'))
    print len(data_x) #229907
    # length = len(data_x)
    # train_x, dev_x = data_x[:int(length*0.9)], data_x[int(length*0.9)+1 :]
    # train_y, dev_y = data_y[:int(length*0.9)], data_y[int(length*0.9)+1 :]
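
As a rough sketch of the per-batch alternative mentioned above (pad_batch is a hypothetical helper, not part of the original code): instead of fixing everything to 30*30, each batch can be padded only up to its own largest document and longest sentence, and the resulting sizes fed to the max_sentence_num / max_sentence_length placeholders of the model.

import numpy as np

def pad_batch(batch_docs):
    # Pad a batch of documents (lists of lists of word ids) only up to the
    # largest document and longest sentence inside this particular batch.
    max_sents = max(len(d) for d in batch_docs)
    max_words = max(len(s) for d in batch_docs for s in d)
    padded = np.zeros((len(batch_docs), max_sents, max_words), dtype=np.int32)
    for i, d in enumerate(batch_docs):
        for j, s in enumerate(d):
            padded[i, j, :len(s)] = s
    return padded, max_sents, max_words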

After this preprocessing we end up with a total of 229,907 documents, each a 30*30 matrix of word indices, so that later, when the data is read, words can be turned into word vectors directly through the embedding matrix E, which saves a lot of trouble. We also need a function that loads the saved data into memory. It is really just a pickle load, after which the dataset is split 9:1 into a training set and a validation set. Personally I feel 9:1 makes the validation set rather large (about 20,000 samples), but that is what the paper does, so let's not fuss over this detail and follow the paper's setup. The code is as follows:

def read_dataset():
    with open('yelp_data', 'rb') as f:
        data_x, data_y = pickle.load(f)
        length = len(data_x)
        train_x, dev_x = data_x[:int(length*0.9)], data_x[int(length*0.9)+1 :]
        train_y, dev_y = data_y[:int(length*0.9)], data_y[int(length*0.9)+1 :]
        return train_x, train_y, dev_x, dev_y

With this function we can load the dataset in one call at training time. Next, let's look at the implementation of the model architecture.

Model implementation

Following the description of the model architecture in the previous post, and with the help of the figures below, it is easy to see that the model is organized into a sentence level and a document level, each of which contains an encoder part and an attention part.
[figures: HAN sentence-level and document-level architecture]
The code is shown below. The main point is using tf.nn.bidirectional_dynamic_rnn() to build the bidirectional GRU; the attention layer is then just an MLP + softmax mechanism, which is easy to understand.

#coding=utf8

import tensorflow as tf
from tensorflow.contrib import rnn
from tensorflow.contrib import layers

def length(sequences):
# return the actual (non-zero) length of each sequence in the batch
    used = tf.sign(tf.reduce_max(tf.abs(sequences), reduction_indices=2))
    seq_len = tf.reduce_sum(used, reduction_indices=1)
    return tf.cast(seq_len, tf.int32)

class HAN():

    def __init__(self, vocab_size, num_classes, embedding_size=200, hidden_size=50):

        self.vocab_size = vocab_size
        self.num_classes = num_classes
        self.embedding_size = embedding_size
        self.hidden_size = hidden_size

        with tf.name_scope('placeholder'):
            self.max_sentence_num = tf.placeholder(tf.int32, name='max_sentence_num')
            self.max_sentence_length = tf.placeholder(tf.int32, name='max_sentence_length')
            self.batch_size = tf.placeholder(tf.int32, name='batch_size')
            # x has shape [batch_size, num_sentences, sentence_length (number of words)]; these differ from sample to sample, so the dimensions are left as None
            # y has shape [batch_size, num_classes]
            self.input_x = tf.placeholder(tf.int32, [None, None, None], name='input_x')
            self.input_y = tf.placeholder(tf.float32, [None, num_classes], name='input_y')

        # build the model
        word_embedded = self.word2vec()
        sent_vec = self.sent2vec(word_embedded)
        doc_vec = self.doc2vec(sent_vec)
        out = self.classifer(doc_vec)

        self.out = out


    def word2vec(self):
        # embedding layer
        with tf.name_scope("embedding"):
            embedding_mat = tf.Variable(tf.truncated_normal((self.vocab_size, self.embedding_size)))
            # shape: [batch_size, sent_in_doc, word_in_sent, embedding_size]
            word_embedded = tf.nn.embedding_lookup(embedding_mat, self.input_x)
        return word_embedded

    def sent2vec(self, word_embedded):
        with tf.name_scope("sent2vec"):
            # The GRU input tensor is [batch_size, max_time, ...]. When building sentence vectors, max_time should be the
            # sentence length, so batch_size * sent_in_doc is treated as the batch size. Each GRU cell then processes one
            # word vector, and the word vectors of a sentence are finally fused together (via attention) into a sentence vector

            # shape: [batch_size*sent_in_doc, word_in_sent, embedding_size]
            word_embedded = tf.reshape(word_embedded, [-1, self.max_sentence_length, self.embedding_size])
            # shape: [batch_size*sent_in_doc, word_in_sent, hidden_size*2]
            word_encoded = self.BidirectionalGRUEncoder(word_embedded, name='word_encoder')
            # shape: [batch_size*sent_in_doc, hidden_size*2]
            sent_vec = self.AttentionLayer(word_encoded, name='word_attention')
            return sent_vec

    def doc2vec(self, sent_vec):
        # same idea as sent2vec: build a document vector from the vectors of all sentences in the document
        with tf.name_scope("doc2vec"):
            sent_vec = tf.reshape(sent_vec, [-1, self.max_sentence_num, self.hidden_size*2])
            # shape: [batch_size, sent_in_doc, hidden_size*2]
            doc_encoded = self.BidirectionalGRUEncoder(sent_vec, name='sent_encoder')
            # shape: [batch_size, hidden_size*2]
            doc_vec = self.AttentionLayer(doc_encoded, name='sent_attention')
            return doc_vec

    def classifer(self, doc_vec):
        # final output layer: a fully connected layer
        with tf.name_scope('doc_classification'):
            out = layers.fully_connected(inputs=doc_vec, num_outputs=self.num_classes, activation_fn=None)
            return out

    def BidirectionalGRUEncoder(self, inputs, name):
        # Bidirectional GRU encoder: encodes all words in a sentence (or all sentence vectors in a document) into outputs of size 2*hidden_size; the attention layer then takes a weighted sum of these outputs to form the final sentence/document vector.
        # inputs has shape [batch_size, max_time, input_size]
        with tf.variable_scope(name):
            GRU_cell_fw = rnn.GRUCell(self.hidden_size)
            GRU_cell_bw = rnn.GRUCell(self.hidden_size)
            # fw_outputs and bw_outputs both have shape [batch_size, max_time, hidden_size]
            ((fw_outputs, bw_outputs), (_, _)) = tf.nn.bidirectional_dynamic_rnn(cell_fw=GRU_cell_fw,
                                                                                 cell_bw=GRU_cell_bw,
                                                                                 inputs=inputs,
                                                                                 sequence_length=length(inputs),
                                                                                 dtype=tf.float32)
            # outputs has shape [batch_size, max_time, hidden_size*2]
            outputs = tf.concat((fw_outputs, bw_outputs), 2)
            return outputs

    def AttentionLayer(self, inputs, name):
        # inputs is the GRU output, with shape [batch_size, max_time, encoder_size (hidden_size * 2)]
        with tf.variable_scope(name):
            # u_context is the context importance vector, used to weigh how much each word/sentence contributes
            # to the sentence/document; since a bidirectional GRU is used, its length is 2*hidden_size
            u_context = tf.Variable(tf.truncated_normal([self.hidden_size * 2]), name='u_context')
            # a fully connected layer encodes the GRU outputs into a hidden representation h of shape [batch_size, max_time, hidden_size * 2]
            h = layers.fully_connected(inputs, self.hidden_size * 2, activation_fn=tf.nn.tanh)
            # alpha has shape [batch_size, max_time, 1]
            alpha = tf.nn.softmax(tf.reduce_sum(tf.multiply(h, u_context), axis=2, keep_dims=True), dim=1)
            # before reduce_sum the shape is [batch_size, max_time, hidden_size*2]; afterwards it is [batch_size, hidden_size*2]
            atten_output = tf.reduce_sum(tf.multiply(inputs, alpha), axis=1)
            return atten_output

That is the main part of the model architecture. The idea itself is quite simple; my main goal was to get familiar with how some of these operations are used. Next comes the training part.

Model training

For the data loading in this part I originally planned to use the TFRecords approach mentioned in the previous post, but in practice I am apparently still not comfortable with it: several attempts all hit small errors. Even though I had already worked through other people's code, writing it myself turned out to be harder than expected, and I need to spend more time studying it. So in the end I went back to the old approach of feeding the data in batches, which at least is simple and easy to understand.

Since this part is mostly repetitive code, I will not go through it in detail; if anything is unclear, see the walkthroughs of the training code in my previous posts.

One point worth highlighting is gradient clipping. RNN models often suffer from exploding and vanishing gradients during training, so gradient clipping is commonly used to keep gradients from growing so large that optimization breaks down. Apart from that, the script essentially follows dennizy's CNN training code.

#coding=utf-8
import tensorflow as tf
import model
import time
import os
from load_data import read_dataset, batch_iter


# Data loading params
tf.flags.DEFINE_string("data_dir", "data/data.dat", "data directory")
tf.flags.DEFINE_integer("vocab_size", 46960, "vocabulary size")
tf.flags.DEFINE_integer("num_classes", 5, "number of classes")
tf.flags.DEFINE_integer("embedding_size", 200, "Dimensionality of character embedding (default: 200)")
tf.flags.DEFINE_integer("hidden_size", 50, "Dimensionality of GRU hidden layer (default: 50)")
tf.flags.DEFINE_integer("batch_size", 32, "Batch Size (default: 64)")
tf.flags.DEFINE_integer("num_epochs", 10, "Number of training epochs (default: 50)")
tf.flags.DEFINE_integer("checkpoint_every", 100, "Save model after this many steps (default: 100)")
tf.flags.DEFINE_integer("num_checkpoints", 5, "Number of checkpoints to store (default: 5)")
tf.flags.DEFINE_integer("evaluate_every", 100, "evaluate every this many batches")
tf.flags.DEFINE_float("learning_rate", 0.01, "learning rate")
tf.flags.DEFINE_float("grad_clip", 5, "grad clip to prevent gradient explode")

FLAGS = tf.flags.FLAGS

train_x, train_y, dev_x, dev_y = read_dataset()
print "data load finished"

with tf.Session() as sess:
    han = model.HAN(vocab_size=FLAGS.vocab_size,
                    num_classes=FLAGS.num_classes,
                    embedding_size=FLAGS.embedding_size,
                    hidden_size=FLAGS.hidden_size)

    with tf.name_scope('loss'):
        loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=han.input_y,
                                                                      logits=han.out,
                                                                      name='loss'))
    with tf.name_scope('accuracy'):
        predict = tf.argmax(han.out, axis=1, name='predict')
        label = tf.argmax(han.input_y, axis=1, name='label')
        acc = tf.reduce_mean(tf.cast(tf.equal(predict, label), tf.float32))

    timestamp = str(int(time.time()))
    out_dir = os.path.abspath(os.path.join(os.path.curdir, "runs", timestamp))
    print("Writing to {}\n".format(out_dir))

    global_step = tf.Variable(0, trainable=False)
    optimizer = tf.train.AdamOptimizer(FLAGS.learning_rate)
    # gradient clipping, commonly used with RNNs to keep gradients from blowing up during training
    tvars = tf.trainable_variables()
    grads, _ = tf.clip_by_global_norm(tf.gradients(loss, tvars), FLAGS.grad_clip)
    grads_and_vars = tuple(zip(grads, tvars))
    train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step)

    # Keep track of gradient values and sparsity (optional)
    grad_summaries = []
    for g, v in grads_and_vars:
        if g is not None:
            grad_hist_summary = tf.summary.histogram("{}/grad/hist".format(v.name), g)
            grad_summaries.append(grad_hist_summary)

    grad_summaries_merged = tf.summary.merge(grad_summaries)

    loss_summary = tf.summary.scalar('loss', loss)
    acc_summary = tf.summary.scalar('accuracy', acc)


    train_summary_op = tf.summary.merge([loss_summary, acc_summary, grad_summaries_merged])
    train_summary_dir = os.path.join(out_dir, "summaries", "train")
    train_summary_writer = tf.summary.FileWriter(train_summary_dir, sess.graph)

    dev_summary_op = tf.summary.merge([loss_summary, acc_summary])
    dev_summary_dir = os.path.join(out_dir, "summaries", "dev")
    dev_summary_writer = tf.summary.FileWriter(dev_summary_dir, sess.graph)

    checkpoint_dir = os.path.abspath(os.path.join(out_dir, "checkpoints"))
    checkpoint_prefix = os.path.join(checkpoint_dir, "model")
    if not os.path.exists(checkpoint_dir):
        os.makedirs(checkpoint_dir)
    saver = tf.train.Saver(tf.global_variables(), max_to_keep=FLAGS.num_checkpoints)

    sess.run(tf.global_variables_initializer())

    def train_step(x_batch, y_batch):
        feed_dict = {
            han.input_x: x_batch,
            han.input_y: y_batch,
            han.max_sentence_num: 30,
            han.max_sentence_length: 30,
            han.batch_size: 64
        }
        _, step, summaries, cost, accuracy = sess.run([train_op, global_step, train_summary_op, loss, acc], feed_dict)

        time_str = str(int(time.time()))
        print("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, cost, accuracy))
        train_summary_writer.add_summary(summaries, step)

        return step

    def dev_step(x_batch, y_batch, writer=None):
        feed_dict = {
            han.input_x: x_batch,
            han.input_y: y_batch,
            han.max_sentence_num: 30,
            han.max_sentence_length: 30,
            han.batch_size: 64
        }
        step, summaries, cost, accuracy = sess.run([global_step, dev_summary_op, loss, acc], feed_dict)
        time_str = str(int(time.time()))
        print("++++++++++++++++++dev++++++++++++++{}: step {}, loss {:g}, acc {:g}".format(time_str, step, cost, accuracy))
        if writer:
            writer.add_summary(summaries, step)

    for epoch in range(FLAGS.num_epochs):
        print('current epoch %s' % (epoch + 1))
        for i in range(0, 200000, FLAGS.batch_size):
            x = train_x[i:i + FLAGS.batch_size]
            y = train_y[i:i + FLAGS.batch_size]
            step = train_step(x, y)
            if step % FLAGS.evaluate_every == 0:
                dev_step(dev_x, dev_y, dev_summary_writer)

Once the model is trained, we can open TensorBoard to see how the training went.

Training results

Training is not exactly slow, but not fast either: on the lab server (64 GB of RAM) it runs roughly 3 batches every 2 seconds. I started the run last night and went back to the dorm, and only afterwards realized I had forgotten to write the dev results to the summary. There is also no shuffling within each epoch, the run was not very long, and no hyperparameters were tuned, so the results are only good enough to show a trend. When I have time in a few days I will rerun it and tweak the parameters to see whether things improve. For now, a few screenshots:
[figures: TensorBoard training curves]

Finally, a few links to blogs and repositories I found useful while studying and reproducing this paper:
1. richliao, who wrote three posts on this paper, working up from CNN to RNN to HAN; they are well written and help deepen understanding, although the implementation uses Keras. Blog and code:
https://richliao.github.io/
https://github.com/richliao/textClassifier
2. Yelp dataset download link:
https://github.com/rekiksab/Yelp/tree/master/yelp_challenge/yelp_phoenix_academic_dataset
3. EdGENetworks, a PyTorch implementation. I have not really read the code, but behind it is an impressive company, explosion.ai, whose blog is full of substance and well worth studying.
https://github.com/EdGENetworks/attention-networks-for-classification
https://explosion.ai/blog/deep-learning-formula-nlp
4. ematvey, a TensorFlow implementation. I borrowed from his data-processing code, but I find the program somewhat hard to read; judge for yourself.
https://github.com/ematvey/deep-text-classifier

Author: liuchonge, published 2017/7/2 16:08:08. Original article link

          (USA-WA-Seattle) Machine Learning Software Dev Engineer   
Have you ever wanted to work on state-of-the-art computer vision, natural language processing and applied machine learning that will make a lasting impact on society? We are looking for brilliant Machine Learning Software Dev Engineers who have the passion to tackle tough problems by bringing cutting-edge deep learning technologies to consumer IoT products at Amazon! As a Machine Learning Software Dev Engineer on the Amazon AI Team, you will design and develop fast, efficient, and highly scalable deep learning algorithms that are applied to challenging everyday use cases. You'll work with senior scientists and engineers within Amazon AI and develop high-quality software that is robust and reliable. Software Engineers at Amazon do so much more than just software development. We'll be looking to you to help: · Decide what features to build. · Drive software engineering best practice. · Design distributed and scalable systems. · Test and document the software you develop. Job location can be either Palo Alto, CA or Seattle, WA. · BS or MS in Computer Science, Applied Math or related Engineering fields with 5+ years of relevant work experience. · Computer Science fundamentals in object-oriented design, data structures, high-performance computing (HPC).
 · Computer Science fundamentals in algorithm design, complexity analysis, problem solving and diagnosis.
 · Proficiency in at least one modern programming language such as Java, C++ or Python. · Can translate user inputs into software requirements and design specifications and effectively communicate with team members. · Ph.D. with 3 years of relevant experience. · Experience with machine learning, deep learning, data mining, and/or statistical analysis tools. · Experience taking projects from scoping requirements through V1 launch and V2 iterations. · Knowledge of professional software engineering practices and best practices for the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations. · Experience with highly distributed, multi-tenant systems with clear stateful/stateless boundaries. · Experience designing high-performance software and algorithms for resource-constrained IoT and mobile environments. · Proficiency training large-scale models in at least one modern deep learning engine such as MXNet, TensorFlow, Caffe/Caffe2, Keras, PyTorch/Torch or Theano. · Experience in GPU, FPGA, DSP acceleration and performance tuning. · Experience in production-scale software development with ML/AI, computer vision and smart IoT devices. AMZR Req ID: 553167 External Company URL: www.amazon.com