{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## TD;LR\n", "\n", "`prob.argmax(axis=1)` を `hack(prob)` に**置き換える** だけで,どんなモデルでも**確実にスコアUP↑**ができる関数 `hack` の紹介です。\n", "\n", "以下に実装とデモを示しますが,`pip install -U pulp` が事前に必要です。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# 実装\n", "import pulp # pip install pulp==2.3\n", "import numpy as np\n", "\n", "\n", "N_CLASSES = [404, 320, 345, 674] # @yCarbonによる推定(過去フォーラム参照)\n", "\n", "# 制約付き対数尤度最大化問題を解く\n", "def hack(prob):\n", " logp = np.log(prob + 1e-16)\n", " N = prob.shape[0]\n", " K = prob.shape[1]\n", "\n", " m = pulp.LpProblem('Problem', pulp.LpMaximize) # 最大化問題\n", "\n", " # 最適化する変数(= 提出ラベル)\n", " x = pulp.LpVariable.dicts('x', [(i, j) for i in range(N) for j in range(K)], 0, 1, pulp.LpBinary)\n", " \n", " # log likelihood(目的関数)\n", " log_likelihood = pulp.lpSum([x[(i, j)] * logp[i, j] for i in range(N) for j in range(K)])\n", " m += log_likelihood\n", " \n", " # 各データについて,1クラスだけを予測ラベルとする制約\n", " for i in range(N):\n", " m += pulp.lpSum([x[(i, k)] for k in range(K)]) == 1 # i.e., SOS1\n", " \n", " # 各クラスについて,推定個数の合計に関する制約\n", " for k in range(K):\n", " m += pulp.lpSum([x[(i, k)] for i in range(N)]) == N_CLASSES[k]\n", " \n", " m.solve() # 解く\n", "\n", " assert m.status == 1 # assert 最適 <=>(実行可能解が見つからないとエラー)\n", "\n", " x_ast = np.array([[int(x[(i, j)].value()) for j in range(K)] for i in range(N)]) # 結果の取得\n", " return x_ast.argmax(axis=1) # 結果をonehotから -> {0, 1, 2, 3}のラベルに変換" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# デモ\n", "prob = model.predict_proba(test_X)\n", "\n", "# before\n", "y = prob.argmax(axis=1) + 1\n", "\n", "# after\n", "y = hack(prob) + 1 # +0.01 ~ 0.02 on LB. depends on your model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 意味がわからない人へ\n", "\n", "どっちが簡単ですか?\n", "\n", "- 以下の選択肢から正解を **一つ** 選べ。\n", "- 以下の選択肢から正解を **すべて** 選べ。\n", "\n", "\\#あとは分かるな" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 4 }