feat: sort
xhiroga committed May 16, 2024
1 parent 37efff0 commit 619dd34
Showing 7 changed files with 716 additions and 1 deletion.
@@ -0,0 +1,120 @@
{
"cells": [
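{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Selection sort"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As in insertion sort, the front (or back) of the data gradually becomes the sorted region.\n",
"\n",
"The differences are that selection sort looks for the smallest (largest) value, then the second smallest (largest) value, and so on in order, and that once the value is found, the value that originally sat at position n is swapped out and thrown to the position where the found value was. That displacement by swapping is the fun part, so I personally like to call it Mithrun sort (after Mithrun in Delicious in Dungeon / ダンジョン飯)."
]
},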
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"def selection_sort(nums: list[int]) -> tuple[list[int], int]:\n",
"    exchanged = 0\n",
"    for i, num in enumerate(nums):\n",
"        min_index = i\n",
"        min_num = num\n",
"\n",
"        # for j, challenger in enumerate(nums[i:]):\n",
"        # WARNING: Do not iterate over a slice in the inner loop when sorting a list. It is easy to forget that j is then relative to the slice and introduce a bug.\n",
"        # WARNING: range(0, 5) yields 0, 1, 2, 3, 4. The second argument, stop, is excluded.\n",
"\n",
"        for j in range(i, len(nums)):\n",
"            challenger = nums[j]\n",
"            if challenger < min_num:\n",
"                min_index = j\n",
"                min_num = challenger\n",
"        if min_index != i:\n",
"            # Swap the minimum into position i; the value that was at i moves to min_index.\n",
"            nums[i], nums[min_index] = nums[min_index], nums[i]\n",
"            exchanged += 1\n",
"\n",
"        print(nums)\n",
"\n",
"    return (nums, exchanged)"
]
},
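{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"def parse(input: str) -> list[int]:\n",
"    # The first line (the element count) is ignored; the second line holds the space-separated values.\n",
"    lines = input.splitlines()\n",
"    return [int(num) for num in lines[1].split(\" \")]"
]
},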
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"input = \"\"\"6\n",
"5 6 4 2 1 3\"\"\"\n",
"assert parse(input) == [5, 6, 4, 2, 1, 3]"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1, 6, 4, 2, 5, 3]\n",
"[1, 2, 4, 6, 5, 3]\n",
"[1, 2, 3, 6, 5, 4]\n",
"[1, 2, 3, 4, 5, 6]\n",
"[1, 2, 3, 4, 5, 6]\n",
"[1, 2, 3, 4, 5, 6]\n"
]
}
],
"source": [
"actual = selection_sort(parse(input))\n",
"expected = ([1,2,3,4,5,6], 4)\n",
"assert(actual == expected)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "til_machine_learning_py312",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
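The WARNING comment in the selection sort cell above notes that iterating over a slice in the inner loop makes `j` relative to the slice. As a minimal sketch (not part of this commit), the variant below keeps the slice but passes `start=i` to `enumerate`, so `j` stays an absolute index. The name `selection_sort_sliced` is hypothetical, and unlike the notebook's function it sorts a copy instead of mutating its argument.

```python
def selection_sort_sliced(nums: list[int]) -> tuple[list[int], int]:
    """Selection sort whose inner loop scans a slice (hypothetical variant)."""
    result = list(nums)  # work on a copy so the caller's list is left untouched
    exchanged = 0
    for i in range(len(result)):
        min_index = i
        # start=i makes j an absolute index into result, not an offset into the slice.
        for j, challenger in enumerate(result[i:], start=i):
            if challenger < result[min_index]:
                min_index = j
        if min_index != i:
            result[i], result[min_index] = result[min_index], result[i]
            exchanged += 1
    return result, exchanged


print(selection_sort_sliced([5, 6, 4, 2, 1, 3]))  # ([1, 2, 3, 4, 5, 6], 4)
```

The slice still copies the tail of the list on every outer iteration, so the `range(i, len(nums))` loop used in the notebook remains the cheaper choice; the sketch only shows how the relative-index pitfall can be avoided if a slice is wanted.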
@@ -0,0 +1,231 @@
{
"cells": [
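{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Shell sort\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Insertion sort is efficient when it is immediately clear where in the sorted data a value taken from the unsorted data should be inserted.\n",
"\n",
"Shell sort therefore preprocesses the data so that each value lands as close as possible to the tail (or head) of the sorted data.\n",
"\n",
"(Conversely, binary insertion sort improves the step of searching the sorted data for the insertion point.)\n"
]
},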
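{
"cell_type": "markdown",
"metadata": {},
"source": [
"Shell sort splits the unsorted array $A$ into sub-arrays taken at interval $g$, namely $A_{\\mathrm{mod}\\,g = 0} = [a_0, a_g, a_{2g}, \\dots]$, $A_{\\mathrm{mod}\\,g = 1} = [a_1, a_{g+1}, \\dots]$, and so on, and insertion-sorts each of them as rough preprocessing.\n",
"\n",
"Besides the approach that looks at the tail of each sub-array in turn (the one used in the spiral book, 螺旋本), there also seems to be an approach that finishes the insertion sort of each of the $g$ sub-arrays before moving on to the next. This implementation uses the latter.\n",
"\n",
"![Shell sort (spiral book)](images/シェルソート(螺旋本).svg)\n"
]
},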
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from logging import getLogger, StreamHandler, DEBUG\n",
"\n",
"logger = getLogger(__name__)\n",
"handler = StreamHandler()\n",
"handler.setLevel(DEBUG)\n",
"logger.setLevel(DEBUG)\n",
"logger.addHandler(handler)\n",
"logger.propagate = False\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"def insertion_sort(nums: list[int], g: int):\n",
"    # Insertion-sort each sub-array of indices with index % g == 0, index % g == 1, ...\n",
"    for rem in range(0, g):\n",
"        logger.debug(f\"{g=}, {rem=}\")\n",
"        # The first pass sorts [rem, rem+g], the second pass [rem, rem+g, rem+2g], and so on.\n",
"        last = rem + g\n",
"        while last < len(nums):\n",
"            logger.debug(f\"{g=}, {rem=}, {last=}\")\n",
"            # Step challenged down by g on each iteration.\n",
"            # There is no early break, so the scan always continues to the start of the sub-array.\n",
"            challenged = last - g\n",
"            while 0 <= challenged:\n",
"                logger.debug(f\"{g=}, {rem=}, {last=}, {challenged=}\")\n",
"                if nums[challenged] > nums[challenged + g]:\n",
"                    nums[challenged], nums[challenged + g] = nums[challenged + g], nums[challenged]\n",
"                challenged -= g\n",
"                logger.debug(f\"{nums=}\")\n",
"\n",
"            last += g\n",
"            logger.debug(f\"{nums=}\")\n",
"\n",
"        logger.debug(f\"{nums=}\")\n",
"\n",
"    return nums"
]
},
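{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"def shell_sort(nums: list[int]):\n",
"    # Build the gap sequence 1, 4, 13, ... (h = 3h + 1) below len(nums), then apply it largest gap first.\n",
"    G_asc = []\n",
"    h = 1\n",
"    while h < len(nums):\n",
"        G_asc.append(h)\n",
"        h = 3 * h + 1\n",
"    G = G_asc[::-1]\n",
"    logger.debug(G)\n",
"\n",
"    for g in G:\n",
"        # As long as the callee never rebinds nums, caller and callee share the same list object, so the assignment below could be omitted.\n",
"        # However, if the caller relies on in-place mutation while the callee actually returns a new list, the mistake only shows up at run time.\n",
"        # So the caller always rebinds the return value, treating the callee as if it were non-mutating, to rule that bug out in advance.\n",
"        nums = insertion_sort(nums, g)\n",
"\n",
"    return (len(G), G, nums)"
]
},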
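{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"def parse(input: str) -> list[int]:\n",
"    # The first line (the element count) is ignored; each following line holds one value.\n",
"    lines = input.splitlines()\n",
"    return [int(num) for num in lines[1:]]"
]
},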
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"input = \"\"\"5\n",
"5\n",
"1\n",
"4\n",
"3\n",
"2\"\"\"\n",
"expected = [5, 1, 4, 3, 2]\n",
"actual = parse(input)\n",
"assert expected == actual"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[4, 1]\n",
"g=4, rem=0\n",
"g=4, rem=0, last=4\n",
"g=4, rem=0, last=4, challenged=0\n",
"nums=[2, 1, 4, 3, 5]\n",
"nums=[2, 1, 4, 3, 5]\n",
"nums=[2, 1, 4, 3, 5]\n",
"g=4, rem=1\n",
"nums=[2, 1, 4, 3, 5]\n",
"g=4, rem=2\n",
"nums=[2, 1, 4, 3, 5]\n",
"g=4, rem=3\n",
"nums=[2, 1, 4, 3, 5]\n",
"g=1, rem=0\n",
"g=1, rem=0, last=1\n",
"g=1, rem=0, last=1, challenged=0\n",
"nums=[1, 2, 4, 3, 5]\n",
"nums=[1, 2, 4, 3, 5]\n",
"g=1, rem=0, last=2\n",
"g=1, rem=0, last=2, challenged=1\n",
"nums=[1, 2, 4, 3, 5]\n",
"g=1, rem=0, last=2, challenged=0\n",
"nums=[1, 2, 4, 3, 5]\n",
"nums=[1, 2, 4, 3, 5]\n",
"g=1, rem=0, last=3\n",
"g=1, rem=0, last=3, challenged=2\n",
"nums=[1, 2, 3, 4, 5]\n",
"g=1, rem=0, last=3, challenged=1\n",
"nums=[1, 2, 3, 4, 5]\n",
"g=1, rem=0, last=3, challenged=0\n",
"nums=[1, 2, 3, 4, 5]\n",
"nums=[1, 2, 3, 4, 5]\n",
"g=1, rem=0, last=4\n",
"g=1, rem=0, last=4, challenged=3\n",
"nums=[1, 2, 3, 4, 5]\n",
"g=1, rem=0, last=4, challenged=2\n",
"nums=[1, 2, 3, 4, 5]\n",
"g=1, rem=0, last=4, challenged=1\n",
"nums=[1, 2, 3, 4, 5]\n",
"g=1, rem=0, last=4, challenged=0\n",
"nums=[1, 2, 3, 4, 5]\n",
"nums=[1, 2, 3, 4, 5]\n",
"nums=[1, 2, 3, 4, 5]\n"
]
},
{
"data": {
"text/plain": [
"(2, [4, 1], [1, 2, 3, 4, 5])"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"shell_sort(parse(input))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "til_machine_learning_py312",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
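The shell sort notes above mention a second approach that works on the tail of each sub-array in turn instead of finishing one sub-array at a time. The following is a rough sketch of how that interleaved variant is commonly written, assuming the same `h = 3h + 1` gap sequence as the notebook. The names `gapped_insertion_sort` and `shell_sort_interleaved` are hypothetical, and this is not claimed to match the spiral book's code line for line.

```python
def gapped_insertion_sort(nums: list[int], g: int) -> None:
    """Insertion-sort nums in place so that each chain i, i+g, i+2g, ... is ordered."""
    for i in range(g, len(nums)):
        value = nums[i]
        j = i - g
        # Shift larger elements of the same chain right by g,
        # stopping early at the first element that is not larger.
        while j >= 0 and nums[j] > value:
            nums[j + g] = nums[j]
            j -= g
        nums[j + g] = value


def shell_sort_interleaved(nums: list[int]) -> tuple[list[int], list[int]]:
    """Return (gaps, sorted copy) using the gaps 1, 4, 13, ... below len(nums)."""
    result = list(nums)  # sort a copy; the argument is left untouched
    gaps = []
    h = 1
    while h < len(result):
        gaps.append(h)
        h = 3 * h + 1
    gaps.reverse()  # apply the largest gap first
    for g in gaps:
        gapped_insertion_sort(result, g)
    return gaps, result


print(shell_sort_interleaved([5, 1, 4, 3, 2]))  # ([4, 1], [1, 2, 3, 4, 5])
```

Because the loop over `i` visits every index from `g` upward, all `g` chains advance together, and the early exit in the inner `while` is what lets the final `g = 1` pass run quickly on nearly sorted data.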