fluentd revision diff (No.14)

-A log collection tool

*Installation [#k58a704b]

-Straightforward on Amazon Linux.

http://dev.classmethod.jp/cloud/td-agent2-amazon-linux/

I followed the page above. The same steps work on CentOS.

To do it by hand, run the following steps.

**Adding the yum repository [#ne69bde8]

- vi /etc/yum.repos.d/td.repo

 [treasuredata]
 name=TreasureData
 baseurl=http://packages.treasure-data.com/redhat/$basearch
 gpgcheck=0

**Installing td-agent [#z0dbacbe]

 yum install td-agent
 

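*Configuration file [#wb9dd2c6]

Define inputs with source and process them with match. A single match block cannot perform several kinds of processing, so to run events through multiple plugins, attach a tag and handle that tag in a separate match block. (This relies on the plugin re-emitting the event under the new tag.)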

 <match apache.access>
   type file
   path /var/tmp/apache_all.log
   tag next.apache.access
 </match>
 <match next.apache.access>
   type file
   path /var/tmp/apache_all2.log
 </match>

 
**Including configuration files [#y077ffd5]

 include conf.d/*.conf

**Listening on HTTP port 8888 [#g838a180]

 # http://localhost:8888/<tag>?json=<json>
 <source>
   type http
   port 8888
 </source>

With type forward, HTTP access is not possible, but the daemon still listens on the configured port.
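
As a rough illustration of the URL pattern in the comment above, the Ruby sketch below builds a POST request that carries the record in a form field named json. The helper name build_fluentd_request, the tag apache.access and the sample record are invented for this example. It only constructs the request and does not send it.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: builds (but does not send) a request for the http input.
# Assumes the source listens on localhost:8888 as in the config above.
def build_fluentd_request(tag, record, host: 'localhost', port: 8888)
  # The tag becomes the URL path.
  uri = URI::HTTP.build(host: host, port: port, path: "/#{tag}")
  request = Net::HTTP::Post.new(uri)
  # The record travels as a form field named "json".
  request.set_form_data('json' => JSON.generate(record))
  [uri, request]
end

uri, request = build_fluentd_request('apache.access', { 'message' => 'hello' })
puts uri
puts request.body
# To actually send it (needs a running td-agent):
#   Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

Keeping the send step out of the helper makes it possible to inspect the request without a running daemon.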

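**type (subtype) descriptions [#y81ab9e0]

|type name|Short description|
|null|Discards events without forwarding them|
|forest|Lets the tag name be used as a placeholder variable, useful for applying the same kind of settings to many tags at once|
|rewrite_tag_filter|Assigns tags based on regular expressions|
|record_modifier|Adds new attributes to a record, for example attaching the host name to Apache logs|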
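
To make the last two rows more concrete, here is a toy Ruby model of the ideas behind rewrite_tag_filter and record_modifier. It is not how the plugins are implemented. The rule list, the tags apache.static and apache.page, the helper names and the host name web01 are all invented for illustration.

```ruby
# Toy model only; these are not the plugins' real implementations.
# Each rule is [record key, pattern, new tag]. The rules are invented examples.
RULES = [
  ['path', /\.(png|css|js)$/, 'apache.static'],
  ['path', /.*/,              'apache.page'],
].freeze

# rewrite_tag_filter idea: the first rule whose pattern matches decides the tag.
def rewrite_tag(record, rules = RULES)
  rule = rules.find { |key, pattern, _tag| pattern.match(record[key].to_s) }
  rule ? rule[2] : nil
end

# record_modifier idea: add an attribute (here a host name) to the record.
def add_host(record, hostname)
  record.merge('host' => hostname)
end

record = { 'path' => '/img/logo.png' }
p rewrite_tag(record)
p add_host(record, 'web01')
```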

**Forwarding local files [#ed3e1947]

 <source>
   type tail
   format apache
   path /var/log/httpd/*_access_log
   tag apache.access
   pos_file /tmp/fluentd-apache.pos
 </source>
 <match apache.access>
        type s3
        aws_key_id 
        aws_sec_key 
        s3_bucket bucket_name
        s3_endpoint s3-ap-northeast-1.amazonaws.com
        path logs/
        buffer_path /var/tmp/fluentd
        time_slice_format %Y%m%d/%H_apache.log
        time_slice_wait 30m
        # Events are PUT to S3 at this interval, so 60s means 1440 requests a day: a near miss with cloud bankruptcy!
        flush_interval 60s
 </match>
 <source>
   type   tail
   path   /var/log/httpd/error_log
   format apache_error
   tag    apache.error
   pos_file /tmp/apache_error.pos
 </source>
 # Send the events to Fluentd's standard log
 <match apache.error>
   type stdout
 </match>
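
The 1440 figure in the flush_interval comment is simply the number of 60-second intervals in a day. A small Ruby sketch of that arithmetic follows. It assumes one PUT per flush for a single buffer, which is a simplification, and the helper name max_flushes_per_day is made up here.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Upper bound on flushes per day for one buffer, assuming one flush
# (and so one S3 PUT) per interval. Real counts depend on traffic.
def max_flushes_per_day(flush_interval_seconds)
  SECONDS_PER_DAY / flush_interval_seconds
end

[60, 600, 3600].each do |interval|
  puts "flush_interval #{interval}s -> up to #{max_flushes_per_day(interval)} PUTs/day"
end
```

Raising flush_interval from 60s to 600s cuts the upper bound from 1440 to 144 requests a day.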


 <source>
   type tail
   path /var/log/httpd/access_log
   pos_file /var/tmp/access_log.pos
   tag httpd
   format none
 </source>
 # Send the events to Fluentd's standard log
 <match httpd>
   type stdout
 </match>

*format [#e5634f6d]

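**Main formats [#i61dd916]

|Format name|Example input|Notes|
|none|Input as-is||
|none_with_hostname||Adds host information to the input string|
|ltsv|domain:example.com|Labeled TSV|
|apache2|Apache combined log|Fails if the log format has been customized|
|apache_error|Apache error log|Fails if the log format has been customized|
|csv,tsv|example.com,/hoge|Keys are defined separately, e.g. keys domain,path|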
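
To show what the ltsv and csv rows mean in practice, the Ruby sketch below turns the example inputs from the table into records. These are simplified stand-ins, not Fluentd's own parsers: the helper names are invented, the CSV version ignores quoting, and the LTSV version expects every field to contain a colon.

```ruby
# Simplified stand-ins for the ltsv and csv formats; not Fluentd's parsers.

# LTSV: tab-separated fields, each written as label:value.
def parse_ltsv(line)
  line.chomp.split("\t").map { |field| field.split(':', 2) }.to_h
end

# CSV: values only; the keys come from a separate list (the "keys" setting).
def parse_csv_with_keys(line, keys)
  keys.zip(line.chomp.split(',')).to_h
end

p parse_ltsv("domain:example.com\tpath:/hoge")
p parse_csv_with_keys('example.com,/hoge', %w[domain path])
```

Both calls produce the same record, with domain set to example.com and path set to /hoge.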

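**Filtering regular expressions [#n3ad45ed]

Writing your own format requires knowledge of Ruby regular expressions.

***Apache (other than combined) [#d9dc7d93]

The regular expression for the date part is the most tedious piece: \[(?<time>[^\]]+)\] captures it. A time_format must also be specified.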

 format /^(?<host>[^ ]+) [^ ]+ [^ ]+ \[(?<time>[^\]]+)\] (?<message>[^ ]+).*$/
 time_format %d/%b/%Y:%T %z 
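
The pattern and time_format above can be tried in plain Ruby without Fluentd. The sketch below does that. The helper name parse_access_line and the sample log line are invented for illustration, and the real parser may differ in how it builds the record.

```ruby
require 'time'

# Same pattern and time_format as in the format line above.
FORMAT = /^(?<host>[^ ]+) [^ ]+ [^ ]+ \[(?<time>[^\]]+)\] (?<message>[^ ]+).*$/
TIME_FORMAT = '%d/%b/%Y:%T %z'

# Hypothetical helper: returns [Time, record], or nil when the line does not match.
def parse_access_line(line)
  m = FORMAT.match(line)
  return nil unless m
  time = Time.strptime(m[:time], TIME_FORMAT)
  [time, { 'host' => m[:host], 'message' => m[:message] }]
end

# Made-up sample line for illustration.
sample = '192.0.2.10 - - [10/Oct/2013:13:55:36 +0900] "GET /index.html HTTP/1.1" 200 2326'
time, record = parse_access_line(sample)
puts time.utc.iso8601
p record
```

Note that message is captured with [^ ]+, so it stops at the first space and holds only the opening token of the request ("GET with its leading quote), not the whole request line.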

***Reference site [#g12c4414]

http://diary.tachibanakikaku.com/2013/12/fluentdformat.html

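***Testing regular expressions locally [#jba8ae91]

 #!/usr/bin/env ruby
 # -*- coding: utf-8 -*-
 require 'time'
 require 'fluent/log'
 require 'fluent/config'
 require 'fluent/engine'
 require 'fluent/parser'
 $log ||= Fluent::Log.new
 # debug: fill in the three values below before running
 log = ''          # one raw log line to test
 format = //       # the regular expression from the format line
 time_format = ''  # the time_format string
 parser = Fluent::TextParser::RegexpParser.new(format, 'time_format' => time_format)
 # Print the parse result for the log line
 puts parser.call(log)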

 /usr/lib64/fluent/ruby/bin/ruby fluenttest.ruby

***Online testing site [#xb89bdf4]

Fluentular: a Fluentd regular expression editor
http://fluentular.herokuapp.com/

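*Running [#eedf55e8]

**Troubleshooting [#u0d6c6ea]

+Wildcards in the input file path surely ought to work, yet they did not. → To be fixed later.
+Reading fails with an error unless the file grants permission to the td-agent group.
+combined logs are not pattern-matched. The log format may have been customized, so this needs investigation. → Confirmed: a customized format is not ingested!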

*Tips [#k3e8b6af]

-Ingesting the secure and messages logs

http://y-ken.hatenablog.com/entry/fluentd-syslog-permission

**Installing plugin gems [#rbd89c52]

When td-agent was installed via yum, plugins must be installed with the Ruby that td-agent manages.

 sudo /usr/lib64/fluent/ruby/bin/fluent-gem install fluent-plugin-zabbix

**Importing existing logs [#id232f2d]

Editing the pos file does not help! It hurts that the tail plugin is the only option.